1 Introduction

The 3M model specification discussed in the article provides a basic framework for meta-analysis of L2 studies in which each study (or at least a good number of them) produces multiple effect size estimates.

The assumptions of this basic 3M specification (i.e., the structure of the true effects) can be elaborated to create more flexible 3M models. In this document, we briefly discuss some of these additional possibilities using the same software and data used in the article.

2 Time as a level with compound symmetry (CS)

Recall that the 3M model discussed in the manuscript had its true effects (or, equivalently, random effects) structured such that true effects were nested in studies. However, one could argue that, just as true effects within a study are assumed to be more similar to one another than to those in other studies, true effects at the same time point within a study are more similar to one another than to those at other time points in that study (time as a level). This new source of similarity leads to correlation among the true effects across all pairs of time points in each study. As a starting point, we may assume a constant correlation among the true effects across all pairs of time points, as well as a common source of random variation for the time-level true effects (i.e., compound-symmetric [CS] time-level effects).

This would mean that the random-effects structure we described in the article can go from:

random = ~1 | study/effect                                                            (1)

to:

random = ~1 | study/time/effect                                                       (2)

The additional time level would then turn our original 3-level model into a 4-level model. For purposes of describing such a model, it is customary to refer to the studies’ average true effects as level 1 (the highest level), the underlying true effects averaged at each time point in a study as level 2, the individual true effects at each time point in a study as level 3, and the individual effect size estimates obtained from the literature at each time point in a study as level 4 (the lowest level). We will visualize this 4-level model and contrast it with the 3-level model discussed in the article in the next section.
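To make this nesting concrete, each row of the data (an effect size estimate) would be indexed by its study, its time point within that study, and a unique effect identifier. Below is a minimal, hypothetical sketch; the rows and values are made up, and the column names yi and vi (for the estimates and their sampling variances) are assumptions rather than the actual WCF data set used in the article:

# Hypothetical rows illustrating the study/time/effect nesting
# (illustrative values only; not the actual WCF data)
dat_sketch <- data.frame(
  study  = rep("Brown, 2017", 4),
  time   = c("post1", "post1", "post2", "post2"),
  effect = 1:4,                         # unique ID for each effect size estimate
  yi     = c(0.41, 0.38, 0.25, 0.22),   # effect size estimates (made up)
  vi     = c(0.04, 0.05, 0.04, 0.05)    # sampling variances (made up)
)
dat_sketch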

2.1 Time as a level visualization

In visual terms, instead of seeing the true effects in each study (e.g., Brown, 2017) hanging directly from the study (study → individual true effects), as in the figure for the 3-level model, we are now going to see the true effects in each study first grouped by time point (study → time points → individual true effects), as in the figure for the 4-level model.

The same random-effects structure (i.e., Compound Symmetric time-level effects) could also be reparameterized such that we could readily see the estimated constant correlation among the true effects across all pairs of the time points in our output:

random = list(~time|study, ~1|effect)                                                 (3)

2.2 Model fitting (CS)

We can now update our original m_prof_new model (see the article) with the random-effects structure defined in (3):

m_prof_CS <- update(m_prof_new, random = list(~time|study, ~1|effect))
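For readers who prefer to see the full specification rather than update(), a minimal sketch of the equivalent direct metafor::rma.mv() call is shown below. The moderator specification mods = ~ prof, the data object dat, and the column names yi and vi are assumptions carried over from the article's setup rather than guaranteed; note also that "CS" is rma.mv()'s default struct for an ~ inner | outer term:

library(metafor)

# Sketch of the equivalent direct call (object and column names are assumed)
m_prof_CS_direct <- rma.mv(yi, vi,
                           mods   = ~ prof,
                           random = list(~ time | study, ~ 1 | effect),
                           struct = "CS",
                           data   = dat)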

The estimated random-effects results can be extracted to examine the estimated constant correlation among the true effects across all pairs of time points, as well as the estimated common source of random variation for the time-level true effects:

results_rma(m_prof_CS, random_only = TRUE)
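If desired, the same components can also be inspected directly on the fitted model object. The sketch below assumes m_prof_CS is a metafor rma.mv object (as the use of struct and profile() elsewhere in this document suggests) and uses metafor's standard component names:

# Variance/correlation components stored on the fitted rma.mv object
m_prof_CS$tau2    # variance of the time-level true effects
m_prof_CS$rho     # constant correlation across pairs of time points
m_prof_CS$sigma2  # variance of the individual true effects (~ 1 | effect part)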

2.3 Model interpretation (CS)

In the table above, \(\sigma\) represents the estimated random variation (in SD units) in the individual true effects at a given time point (pre- or post-test) in a given study. \(\tau\) represents the estimated random variation in the true effects obtained across the time points (i.e., the time-level true effects). And \(\rho\) represents the estimated constant correlation among the true effects across all pairs of time points.
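For instance, sketched for three time points (e.g., baseline, post1, and post2), the CS assumption implies one and the same correlation \(\rho\) for every pair of time points, so the correlation matrix of the time-level true effects takes the form:

\[
\mathbf{R}_{CS} =
\begin{pmatrix}
1 & \rho & \rho \\
\rho & 1 & \rho \\
\rho & \rho & 1
\end{pmatrix}
\]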

3 Time as a level with heteroscedastic compound symmetry (HCS)

A yet more complex model could be fit by allowing separate sources of random variation in the true effects across the time points (i.e., allowing heteroscedasticity) while keeping their correlation constant across all pairs of those time points (heteroscedastic compound symmetry; HCS).

3.1 Model fitting (HCS)

We can again update our original m_prof_new model with the HCS random-effects structure:

m_prof_HCS <- update(m_prof_new, random = list(~time|study, ~1|effect), struct="HCS")

Once again, the random-effects results can be extracted to examine the estimated constant correlation among the true effects across all pairs of time points, as well as the separately estimated sources of random variation for the time-level true effects:

results_rma(m_prof_HCS, random_only = TRUE)

3.2 Model interpretation (HCS)

In the table above, \(\sigma\) represents the estimated random variation (in SD units) in the individual true effects at a given time point (pre- or post-test) in a given study. However, \(\tau\) is now a separately estimated source of random variation in the true effects at (rather than across) each time point. And \(\rho\) represents the estimated correlation among the true effects across all pairs of time points.
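In matrix terms (again sketched for three time points), the HCS structure implies time-specific variances \(\tau_1^2, \tau_2^2, \tau_3^2\) on the diagonal but still a single correlation \(\rho\) for every pair of time points:

\[
\mathbf{G}_{HCS} =
\begin{pmatrix}
\tau_1^2 & \rho\,\tau_1\tau_2 & \rho\,\tau_1\tau_3 \\
\rho\,\tau_1\tau_2 & \tau_2^2 & \rho\,\tau_2\tau_3 \\
\rho\,\tau_1\tau_3 & \rho\,\tau_2\tau_3 & \tau_3^2
\end{pmatrix}
\]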

3.3 Model checking (HCS)

Note that the estimated \(\rho\) is \(1\) (the maximum possible value for a Pearson correlation coefficient). It is often a good idea to run a quick check on a model that produces such an extreme result (and, ideally, on its other results as well) to ensure that the extreme estimate was correctly arrived at by the model.

To do so, we usually profile the likelihood of the model over the range of possible values of \(\rho\). If the extreme result (in our case, \(\rho=1\)) happens to have the highest likelihood among them, then we can rest assured that it did not arise from some kind of glitch during the estimation process:

(The argument rho = 1 tells profile() to profile the first, and here only, \(\rho\) component of the model rather than its other variance components.)

profile(m_prof_HCS, rho = 1)

Thankfully, in our case, \(\rho=1\) does have the highest likelihood among all possible values of \(\rho\) (from -1 to 1). If this were not the case, we would see some form of flatness in the plot above.
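As an additional, optional check, a profile-likelihood confidence interval for \(\rho\) can also be requested. The sketch below uses metafor's confint() method, again pointing it at the first (and only) \(\rho\) component:

# Profile-likelihood CI for the rho component of the HCS model
confint(m_prof_HCS, rho = 1)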

4 Time as a level with unstructured true effects (UN)

A yet more complex model could be fit by allowing separate sources of random variation in the true effects across the time points (i.e., allowing heteroscedasticity) and also letting their correlation be estimated separately for each pair of time points (unstructured; UN).

4.1 Model fitting (UN)

We can insert this assumption into our model by:

m_prof_UN <- update(m_prof_new, random = list(~ time | study, ~1|effect), struct = "UN")

The estimated random-effects results can then be extracted for examination:

results_rma(m_prof_UN, random_only = TRUE)

4.2 Model interpretation (UN)

In the table above, \(\sigma\) represents the estimated random variation (in SD units) in the individual true effects at a given time point (pre- or post-test) in a given study. Just like in the HCS model, \(\tau\) is now estimated separately for the random variation in the true effects at each time point. However, \(\rho\) is now also estimated separately for the correlation between the true effects at each pair of time points.
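Sketched for three time points, the correlation matrix of the time-level true effects under the UN structure therefore contains a distinct correlation for each pair of time points (alongside the time-specific variances):

\[
\mathbf{R}_{UN} =
\begin{pmatrix}
1 & \rho_{12} & \rho_{13} \\
\rho_{12} & 1 & \rho_{23} \\
\rho_{13} & \rho_{23} & 1
\end{pmatrix}
\]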

5 Time as a level with autoregressively correlated true effects (AR)

Given the longitudinal nature of the WCF studies, another possible assumption might be that the true effects of WCF are more strongly correlated between adjacent time points than between time points further apart.

5.1 Model fitting (AR)

We can encode this assumption into our model by:

m_prof_AR <- update(m_prof_new, random = list(~ time | study, ~1|effect), struct = "AR")

The estimated random-effects results can then be extracted for examination:

results_rma(m_prof_AR, random_only = TRUE)

5.2 Model interpretation (AR)

In the table above, \(\sigma\) represents the estimated random variation (in SD units) in the individual true effects at a given time point (pre- or post-test) in a given study. Just like in the CS model, \(\tau\) represents the estimated random variation in the true effects across the time points. However, the correlation between the true effects at two time points now depends on how far apart those time points are: true effects at time points that are \(k\) lags apart are correlated as \(\rho^k\) (with \(\rho\) as estimated in the table above), so the correlation weakens as the time points move further apart.

For instance, the true effects obtained at any two adjacent time points (e.g., baseline ~ post1) in the WCF literature are correlated as \(\rho\) (\(.643\)), whereas the true effects obtained at time points two lags apart (e.g., baseline ~ post2) are correlated as \(\rho^2\) (\(.643^2 = .413\)).
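As a quick arithmetic check on this pattern, the implied correlations at increasing lags can be computed directly from the estimated \(\rho\) reported above:

# Implied AR correlations at lags 1, 2, and 3, using the estimated rho
rho <- 0.643
round(rho^(1:3), 3)  # 0.643, 0.413, 0.266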

6 Time as a level with heteroscedastic and autoregressively correlated true effects (HAR)

Similar to the AR structure, we can additionally allow for heteroscedastic random variation in the true effects of WCF across the different time points (heteroscedastic autoregressive; HAR).

6.1 Model fitting (HAR)

We can encode this assumption into our model by:

m_prof_HAR <- update(m_prof_new, random = list(~ time | study, ~1|effect), struct = "HAR")

The estimated random-effects results can then be extracted for examination:

results_rma(m_prof_HAR, random_only = TRUE)

6.2 Model interpretation (HAR)

In the table above, \(\sigma\) represents the estimated random variation (in SD units) in the individual true effects at a given time point (pre- or post-test) in a given study. Everything said about \(\rho\) under the AR structure applies to the \(\rho\) estimated under the HAR structure as well. However, just like in the HCS model, we are now allowing separate sources of random variation in the true effects across the time points (heteroscedasticity); these separately estimated sources of variation are labeled \(\tau\) in the table above.
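Sketched for three time points, the HAR structure therefore combines time-specific variances with autoregressively decaying correlations:

\[
\mathbf{G}_{HAR} =
\begin{pmatrix}
\tau_1^2 & \rho\,\tau_1\tau_2 & \rho^2\,\tau_1\tau_3 \\
\rho\,\tau_1\tau_2 & \tau_2^2 & \rho\,\tau_2\tau_3 \\
\rho^2\,\tau_1\tau_3 & \rho\,\tau_2\tau_3 & \tau_3^2
\end{pmatrix}
\]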

7 Model Selection

At this point, a key question is which set of theoretical assumptions built into the models above provides the best empirical match to the meta-analytic impact of proficiency (prof) on WCF’s effectiveness.

To answer this question, we can estimate the relative weight of each model (an Akaike weight, interpretable as the probability that a given model is the best-fitting one among the candidates). To do so, we can use the AICctab() function from the bbmle package:

library(bbmle)

AICctab(m_prof_new, m_prof_CS, m_prof_HCS, m_prof_UN, m_prof_AR,
        m_prof_HAR, weights=TRUE, base=TRUE)

AICctab() orders the models by their weights. In our case, m_prof_HAR and m_prof_HCS appear to be similarly best-fitting. Thus, inferences drawn from either of these models will likely enjoy better conclusion validity than those drawn from the other candidate models.
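A metafor-native cross-check (a sketch) is to tabulate the information criteria directly with fitstats(), which reports the log-likelihood, deviance, AIC, BIC, and AICc for each fitted model:

# Information criteria for all candidate models, side by side
fitstats(m_prof_new, m_prof_CS, m_prof_HCS, m_prof_UN, m_prof_AR, m_prof_HAR)

Because all of these models share the same fixed effects and differ only in their random-effects structures, comparing their information criteria in this way is appropriate (assuming they were all fit with the same estimator, such as metafor's default REML).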

8 Conclusion

Almost all of the post-model analyses (post_rma()), outlier detection (interactive_outlier()), and visualizations (plot_rma()) discussed in detail in the article work seamlessly with these more advanced models as well.

However, the 3M framework offers still more flexibility to accommodate a wide range of other meta-analytic situations, research questions, and study complexities. These capabilities, of course, require much more space and background, so we hope to introduce them to the field in the near future.