In Note 3 in the manuscript, we noted that our UTOS moderators specifically apply to paths a and c, but not b (see Figure 1). In our study, we have ten moderators (i.e., orthographic script distance, region, note-taking option, note-taking type, material type, measure, input type, learners’ proficiency, learning target, and time). Some moderators might directly affect learners’ note-taking behavior when learners are exposed to the L2 input. For example, learners’ L1-L2 orthographic distance may affect the ease with which learners can understand the input (Zhang & Zhang, 2020), and in turn affect their ability to take notes. Similarly, the different regions where the study was conducted might also influence learners’ note-taking perceptions and habits (Siegel & Kusumoto, 2022). Note-taking options (i.e., whether learners are required or allowed to take notes), note-taking instruction (i.e., whether or not learners are provided with any note-taking instruction), and note-taking types can affect the effectiveness of note taking to a certain degree given their ability to engage or re-direct students’ attention to various aspects of input (Siegel, 2021).
Some other moderators might also affect note taking. For instance, a learner with a higher proficiency level might identify information more easily when encountering input and might be more motivated to take notes, thereby enhancing the efficiency and effectiveness of the note-taking process. Also, the mode in which input is presented to learners, whether written or aural, might influence how learners take notes. The nature of the material itself might also affect learners’ note-taking behavior. Academic input, which might be more complex and in-depth than non-academic input, might make note taking more challenging (Jin & Webb, 2023). The effect of note taking may also vary depending on the measure type. Measuring learning outcomes via recognition tests (e.g., multiple-choice items) or recall tests (e.g., writing the meaning of a given word or the L2 word that corresponds to a given meaning) may require different depths of processing, which in turn can influence (i.e., moderate) the effect of note taking. Another moderator, learning outcome, might also affect note taking. These learning outcomes might guide learners on what to focus on when receiving input. For instance, when the learning outcome is reading comprehension, the notes might be broader in scope (e.g., targeting the content). However, when the learning outcome is vocabulary learning, the notes might be narrower in scope (e.g., targeting the keywords). The moderator time (i.e., outcome measurement timing) was added to this meta-analysis to differentiate between learners’ pre- versus post-treatment learning outcomes and thereby to measure the possible gains (i.e., difference between pre- and post-tests) from note-taking as a learning aid.
As can be seen, all of our substantive UTOS variables can potentially moderate the act of note taking (path a) and/or the processing of input (path c) and thus do not, by definition, apply to path b directly. Finally, as noted in the manuscript, our M moderators which by themselves “do not necessarily merit an interpretation [were all] adjusted for in the background” (Norouzian & Bui, 2024, p. 16), so that the impact of the substantive UTOS variables can be more clearly examined.
Figure 1. Theoretical framework for note taking
As noted in the manuscript, we included 10 substantive (UTOS) and 3 additional methodological (M) moderators in our study. The following table provides a detailed description of the considerations involved in excluding certain substantive moderators from Jin & Webb (2023). Please see the methodology section in the manuscript for the full description of our moderators.
Moderators in Jin & Webb (2023) | Exclusion Reason |
---|---|
Context | Lack of data on second language context at baseline which was needed to measure the learning gains. |
Institutional level | In Jin & Webb (2023), this variable was used as a proxy for proficiency. In our study, we used the contextual evidence (e.g., TOEFL scores, institutional proficiency requirements, the language of instruction requirements) in the primary studies to create 3 broad categories for proficiency (i.e., beginner to lower intermediate, intermediate, high intermediate to advanced) to more directly measure the influence of proficiency. As a validation step, we examined the average effect sizes at the pre-test (before the treatment introduction) for these 3 proficiency groups and confirmed that learners at the "beginner to lower intermediate" level had the lowest outcome knowledge, followed by "intermediate", and "high intermediate to advanced" learners with higher levels of outcome knowledge, respectively. |
Provision of note-taking strategy instruction | The moderator "provision of note-taking instruction" had two quite broad categories: (1) those that involved conventional note taking without specific instruction, and (2) all others that involved some forms of instructed or structured note taking. As a result, knowing the note taking type determined exactly whether note-taking instruction was provided or not. In other words, a meta-analytic comparison for note taking types included the information for a meta-analytic comparison for note-taking instruction provision, and thus the latter was redundant. |
Opportunity to review notes | Insufficient data at baseline (i.e., only 3 effects from 2 studies) for the learners who did not have the opportunity to review their notes, which invalidated the measurement of learning gains for this variable. |
Number of notetaking sessions | Paralleled the number of note-taking treatments, one of the methodological variables included in this study. |
Note taking instruction length | Because note-taking instruction, by definition, was not applicable to one of the note-taking types (i.e., conventional note-taking), analyzing this moderator (i.e., instruction length) would have either: (a) restricted our sample size by requiring us to code the conventional note-taking effects as NA/blank or (b) led to invalid results by requiring us to artificially code the conventional note-taking effects as 0. Additionally, we added a new moderator that was applicable to all note-taking types and that measured the length of each study from the first to the last measure, which was controlled for as a methodological variable that differentiated the studies to varying degrees (see Table 2 in the manuscript). |
The execution of these initial analyses may be time consuming. Unless otherwise needed, we suggest that readers instead run the analyses in the next section, which use the saved results of these initial analyses. Additionally, to better understand the variables involved in the initial analyses (e.g., those used for estimating Hedges’ g effect sizes), the next table provides a list of their names and definitions. (Click on Code on the bottom right for the reproducible code used in each section.)
# We use Software introduced by Norouzian & Bui (2024)
source("https://t.ly/olaQ0")
# We also use the following R package for choosing the best candidate model
library(bbmle)
# Raw coding sheet with merged first row and lots of empty cells
dat <- read.csv("https://t.ly/i5aYY", na=c(NA,"","NA","NULL"))
# Remove the merged first row but use the first row to rename the column names
dat2 <- setNames(dat[-1,], dat[1,])
# Remove any accidental spaces or empty rows or columns
dat3 <- full_clean(dat2)
# Make sure each column's data type is correctly recorded
dat4 <- type.convert(dat3, as.is=TRUE)
# Compute effect sizes
dat5 <- escalc("SMD", m1i = mT, m2i = mC, sd1i = sdT, sd2i = sdC,
n1i = nT, n2i = nC, data = dat4, var.names = c("g", "v_g"))
# Adjust for assignment by intact classes
dat6 <- group_by(dat5, study) %>%
mutate(
g2 = ifelse(assign_type=="class", g_cluster(g, n_class, Nt, Nc), g),
v_g2 = ifelse(assign_type=="class", g_vi_cluster(g, n_class, Nt, Nc), v_g),
SE_egger = sqrt((nT + nC) / (nT * nC)),
time = recode(time, "pretest" = "baseline"),
region = recode(region, "Asia" = "East Asia") # reviewer requested changing Asia to East Asia
) %>% ungroup() %>%
mutate(effect = row_number())
# How many effects and studies
dat6 %>%
group_by(study) %>%
summarise(n_gi = n()) %>%
summarise(
`No. of Studies` = n(),
`No. of Effects` = sum(n_gi)
) %>% ungroup()
# What is the distribution of effects
ggplot(dat6) + aes(g2) + geom_density()
# Quite skewed to the right, looks like we have some large effects
# even though we have only 57 effects from 27 studies
# What are the two largest effects?
two_largest <- tail(sort(dat6$g2),2)
# [1] 6.89757 10.45655 insanely large, many times larger than
# mean(dat6$g2) which is 0.9!
# These two large effects also exceed 3*SD from the mean (Lipsey & Wilson, 2001)
two_largest > with(dat6, c(`3SDfromMean`= mean(g2)+3*sd(g2)))
# [1] TRUE TRUE
# Let's inspect the impact of these two extreme effects on a
# basic 3-level model
# Reintroduce naturally occurring dependence before removing 2 largest effects
Vs <- with(dat6, impute_covariance_matrix(v_g2, study, r=.5,
subgroup = sample_id))
# 3-level Additive symmetry model
m1 = rma.mv(g2 ~ time + study_length+no_treat+true_experiment, Vs,
random = ~1|study/effect, data = dat6,
dfs = "contain")
# Removing 2 largest effect sizes to measure their impact on m1
dat7 <- filter(dat6, !g2 %in% two_largest)
# Reintroduce naturally occurring dependence AFTER removing 2 largest effects
Vs_af <- with(dat7, impute_covariance_matrix(v_g2, study, r=.5,
subgroup = sample_id))
# m1 model before removing 2 largest effects
m_before <- m1
# m1 model AFTER removing 2 largest effects
m_after <- update(m_before, data=dat7, V=Vs_af)
# Measuring the CIs width of pre- post effects for models BEFORE (_bf) & AFTER (_af)
(t_bf =type.convert( post_rma(m_before,~ time)$table, as.is=TRUE))
(t_af =type.convert( post_rma(m_after,~ time)$table, as.is=TRUE))
(t_bf_ci_widths = t_bf$Upper - t_bf$Lower)
(t_af_ci_widths = t_af$Upper - t_af$Lower)
# The %reduction in the width of CIs due to removing two outliers
paste0(round((t_bf_ci_widths - t_af_ci_widths)/t_bf_ci_widths*100),"%")
# [1] "52%" "46%" "57%"
# Vast improvement in precision (CIs narrower by up to 57%) due to removing two outliers!
# Continue to model selection without the two outlying effects using dat7
# Let's run 5 more models in addition to m_after and choose:
####################
# Model selection
####################
# 3-level Additive symmetry model
m1 <- m_after
# Estimation checks out, passes!
profile(m1)
# Homogeneous Auto-regressive model
m2 = rma.mv(g2 ~ time + study_length+no_treat+true_experiment, Vs_af,
random = list(~time|study, ~1|effect), struct = "AR",
data = dat7,
dfs = "contain")
# Estimation checks out, passes!
profile(m2)
# Heterogeneous auto-regressive model
m3 = rma.mv(g2 ~ time + study_length+no_treat+true_experiment, Vs_af,
random = list(~time|study, ~1|effect), struct="HAR",
data = dat7,
dfs = "contain")
# Estimation doesn't check out, exclude this model!
profile(m3)
# Heterogeneous compound symmetry model
m4 = rma.mv(g2 ~ time + study_length+no_treat+true_experiment, Vs_af,
random = list(~time|study, ~1|effect), struct = "HCS",
data = dat7,
dfs = "contain")
# Estimation doesn't check out, exclude this model!
profile(m4)
# Homogeneous compound symmetry model
m5 = rma.mv(g2 ~ time + study_length+no_treat+true_experiment, Vs_af,
random = list(~time|study, ~1|effect), struct = "CS",
data = dat7,
dfs = "contain")
# Estimation doesn't check out, exclude this model!
profile(m5)
# 4-level Additive symmetry model
m6 = rma.mv(g2 ~ time + study_length+no_treat+true_experiment, Vs_af,
random = ~1|study/time/effect, data = dat7,
dfs = "contain")
# Estimation checks out, passes!
profile(m6)
# Run a weighted comparison between the above 'checked out' models
AICctab(m1, m2, m6, weights=TRUE, base=TRUE)
# m1 wins!! We'll use a 3-level additive symmetry model.
# Q: Is this overall longitudinal model sensitive to the amount of naturally occurring dependence?
# Time effects:
p2 <- post_rma(m1, ~time)
# Sensitivity analysis:
sense_rma(p2, var_name = "v_g2")
# A: Not really except in the case of posttest2 effects which are excluded from interpretation due to their extremely limited number (see next part).
# How many studies and effects for each meta-analytic model do we have?
moderators <- c(
"time",
"treat_grp",
"outcome",
"measure",
"input_mode",
"material_type",
"note_option",
"prof",
"script_distance",
"region")
# A list of time and moderators interacting with time
LIST <- c("time", map(moderators[-1], c, "time"))
# Count # of studies and effects at each time
setNames(map(LIST, ~effect_count(dat7, study, !!!syms(.), show0=FALSE, arrange_by="time", na.rm=TRUE)), moderators)
# post-test2 effects (m=4) are from 3 studies! Exclude from interpretations.
########################################
# Model fitting after initial steps
########################################
# Fit all moderator models using a function
fit_model <- function(pred="none",
V = Vs_af, data = dat7,
method_vars = c("study_length","no_treat",
"true_experiment")){
overall <- pred=="none"
time_case <- if(pred!="time")"* time" else " "
form <- as.formula(paste("g2 ~", if(overall) "" else paste(paste(pred, time_case),"+"),
paste(setdiff(method_vars,pred),
collapse = "+")))
m <- rma.mv(form, V = V,
random = ~1|study/effect, data = data,
dfs = "contain")
m0 <- update.rma(m, yi = g2 ~ 1)
form_post_rma <- if(overall) ~1 else as.formula(paste("~",pred, time_case))
ems <- post_rma(m, form_post_rma)
form_plot <- if(overall) ~1 else as.formula(paste(if(pred=="time") "~" else paste(pred,"~"), "time"))
legend_t <- if(!overall){
if(pred=="time")"Time" else
if(pred=="measure")"Measure Type" else
if(pred=="test_type")"Test Type" else
if(pred=="prof") "Proficiency" else
if(pred=="study_setting") "Study Setting" else
if(pred=="lang_context") "Language Context" else
if(pred=="treat_grp") "Note-Taking Type" else
if(pred=="region") "Region" else
if(pred=="input_mode") "Input Mode" else
if(pred=="note_option") "Note-Taking Option" else
if(pred=="note_instruct") "Note-Taking Instruction" else
if(pred=="script_distance") "L1-L2 Orthographic Distance" else
if(pred=="age_group") "Age Group" else
if(pred=="material_type") "Material Type" else
str_to_title(pred)
} else
{ "Overall Effect" }
plot <- plot_rma(m, form_plot, xlab = if(!overall) "Time" else NULL, ylab="Effect Size (Hedges' g)", dodge=.25) +
labs(color = legend_t) + theme_test() +
scale_color_manual(values = c("black","red", "blue", "green3", "purple",
"orange3", "pink3", "red4"))
R2 <- R2_rma(m, null_model = m0, model_names = legend_t)
list(model = m, ems = ems, plot = plot, R2 = R2)
}
# Fit all moderator models:
out <- setNames(map(moderators, fit_model), moderators)
# Save them and share them with readers
saveRDS(out, "np.rds")
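The outlier screen used in the code above flags effects that lie more than three standard deviations above the mean. The following is a minimal sketch of that rule on a small set of hypothetical effect sizes; `g_demo` and its values are illustrative assumptions and are not taken from the study data.

```r
# Hypothetical effect sizes (illustrative only; not the study data)
g_demo <- c(0.1, 0.3, 0.5, 0.2, 0.4, 0.6, 0.0, 0.7,
            0.3, 0.5, 0.2, 0.4, 0.1, 0.6, 0.3, 8.0)

# Upper cutoff: three standard deviations above the mean
cutoff <- mean(g_demo) + 3 * sd(g_demo)

# Flag effects beyond the cutoff and keep the rest
is_outlier <- g_demo > cutoff
g_kept <- g_demo[!is_outlier]

# The flagged effect pulls the mean upward; removing it lowers the mean
c(mean_before = mean(g_demo), mean_after = mean(g_kept))
```

Because the mean and standard deviation are computed with the extreme value included, this screen is conservative: only very large effects exceed the cutoff.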
Variable | Definition |
---|---|
study | Author(s) |
year | Year of publication |
gray | Gray literature binary identifier (i.e., research produced by less visible or well-known publishers or by organizations outside of the traditional commercial or academic publishing and distribution channels) |
assign_type | Assignment of groups by class or by student |
n_class | Total number of classes assigned to all groups |
Nt | Total number of students in all classes assigned to treatments |
Nc | Total number of students in all classes assigned to control |
sample_id | Numerical index of the independent samples of participants |
g | Effect size estimates before assign_type adjustment |
v_g | Sampling variances before assign_type adjustment |
g2 | Effect size estimates after assign_type adjustment |
v_g2 | Sampling variances after assign_type adjustment |
nT | Number of participants in each treatment group |
mT | Mean of each treatment group |
sdT | Standard deviation of each treatment group |
nC | Number of participants in the control group |
mC | Mean of the control group |
sdC | Standard deviation of the control group |
effect | Numerical index of each row |
SE_egger | Modified standard error of Hedges' g for Egger's test purposes based on Pustejovsky and Rodgers (2019)* |
*For more details see: https://onlinelibrary.wiley.com/doi/10.1002/jrsm.1332 |
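To show how the summary-statistic variables in the table above combine into an effect size, here is a minimal base-R sketch of a Hedges' g calculation. The input numbers are hypothetical, the small-sample correction uses a common approximation, and the variance is a standard large-sample formula; `escalc()` as called in the initial analyses may apply a slightly different form of the correction, so treat this as an illustration and not as the exact computation behind the reported estimates.

```r
# Hypothetical summary statistics (illustrative only; not from the coding sheet)
mT <- 14.2; sdT <- 3.1; nT <- 25   # treatment group
mC <- 12.0; sdC <- 3.4; nC <- 27   # control group

# Pooled standard deviation across the two groups
sd_pooled <- sqrt(((nT - 1) * sdT^2 + (nC - 1) * sdC^2) / (nT + nC - 2))

# Standardized mean difference and small-sample correction (approximation)
d <- (mT - mC) / sd_pooled
J <- 1 - 3 / (4 * (nT + nC - 2) - 1)

# Hedges' g and a large-sample approximation of its sampling variance
g   <- J * d
v_g <- 1 / nT + 1 / nC + g^2 / (2 * (nT + nC))

# Modified standard error, same formula as SE_egger in the initial analyses
SE_egger <- sqrt((nT + nC) / (nT * nC))

c(g = g, v_g = v_g, SE_egger = SE_egger)
```

Note that `SE_egger` depends only on the group sizes, whereas `v_g` also depends on the size of `g` itself.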
As noted above, this section uses the saved results of the previous section (no need to actually run the R code in the previous section, unless otherwise needed). Readers are encouraged to run the following R code. Once again, click on Code on the bottom right for the reproducible code used in each section.
# We use the software package introduced by Norouzian & Bui (2024)
source("https://raw.githubusercontent.com/rnorouzian/i/master/3m.r")
library(knitr)
library(flextable)
library(kableExtra)
library(rmarkdown)
opts_chunk$set(message=FALSE, warning=FALSE, fig.align="center")
# data after outlier removal from previous initial analyses
dat7 <- read.csv("https://raw.githubusercontent.com/fpqq/w/main/dat_after_processing.csv")
g <- dat7 %>%
group_by(study) %>%
summarise(n_gi = n()) %>%
summarise(
`No. of Studies` = n(),
`No. of Effects` = sum(n_gi),
`Min. Effects in Study` = min(n_gi),
`Max. Effects in Study` = max(n_gi),
`Median Effects in Study` = median(n_gi)
) %>% ungroup()
flextable(g) %>%
autofit() %>% set_caption("Distribution Summary of Effect Sizes") %>% fontsize(size = 11, part = "all") %>%
line_spacing(space = .6, part = "all")
No. of Studies | No. of Effects | Min. Effects in Study | Max. Effects in Study | Median Effects in Study |
---|---|---|---|---|
27 | 55 | 1 | 6 | 2 |
Figure 3 displays the studies’ individual effect size estimates aggregated at the study level. The dotted triangle indicates the boundaries for statistical significance on either side of the null effect (i.e., a true study-level effect of 0). As can be seen, there are six effect size estimates aggregated at the study level that are statistically significant in magnitude and positive in direction. This value constitutes ~22% of the total number of study-level aggregate effect sizes in our meta-analysis. Furthermore, half of these study-level aggregate effect sizes are from the “Less Visible Literature” (Hopewell, Clarke, & Mallett, 2005), including the largest of them. Arguably, such evidence does not seem to indicate a tendency for the note-taking literature to intentionally favor studies that, as a whole, have found positive and statistically significant effects from note-taking. Thus, this form of publication bias at the study level seems less likely.
###############################
# 3M publication bias detection
###############################
# Naturally existing dependence from previous section
Vs = with(dat7, impute_covariance_matrix(v_g2, study, r=.5,
subgroup = sample_id))
# Magnitude of within-study correlations
rho <- 0.5
# Aggregate effects at Study level (level 3)
data_agg_study <-
dat7 %>%
escalc(data = ., yi = g2, vi = v_g2) %>%
aggregate.escalc(cluster = study, rho = rho, weighted = FALSE)
# Contour plot at study level
with(data_agg_study,
contour_funnel(x = g2,
vi = v_g2, sig = FALSE,
xlab = "Study-Level Effect Sizes",
col = ifelse(gray=="yes","red","blue"),
bg = ifelse(gray=="yes","red","blue")))
legend("topright", c("Less Visible","Mainstream"), title = "Literature", pch = 19,
col = c("red","blue"), title.font = 2, cex = .8)
box()
Figure 3. Contour-Enhanced Funnel Plot of Study-Level Effects
# This time get the tabular counts of study-level effects that are sig.
g <- with(data_agg_study,
contour_funnel(x = g2,
vi = v_g2, sig = TRUE))
flextable(g) %>%
autofit() %>% set_caption("Statistically significant study level effects") %>% fontsize(size = 11, part = "all") %>%
line_spacing(space = .6, part = "all")
Total | Total(%) | Left | Left(%) | Right | Right(%) | Sig. |
---|---|---|---|---|---|---|
7 | 25.93 | 1 | 3.7 | 6 | 22.22 | 0.05 |
Figure 4 displays the studies’ individual effect size estimates. As before, the dotted triangle indicates the boundaries for statistical significance on either side of the null effect (i.e., a true individual effect of 0). As can be seen, there are seventeen effect size estimates that are statistically significant in magnitude and positive in direction. This value constitutes ~31% of the total number of effect sizes in our meta-analysis. On the other hand, there are two effect size estimates that are statistically significant in magnitude and negative in direction. This value constitutes ~4% of the total number of effect sizes in our meta-analysis. Furthermore, ~30% of these effect estimates are from the “Less Visible Literature”, including the largest of them.
Arguably, the comparison at the effect size level could potentially suggest the possibility of an imbalance in the note-taking literature in favor of the positive and statistically significant effects. However, given the lack of such an imbalance at the study-level and presence of multiple positive and statistically significant effects in the less visible literature, the trend seen at the effect size level might indicate a somewhat natural process that is not, for the most part, impacted by the publication industry’s policies as to which studies should be published and which ones should not in the note-taking literature.
# Contour plot at effect size level
with(dat7,
contour_funnel(x = g2,
vi = v_g2, sig = FALSE,
col = ifelse(gray=="yes","red","blue"),
bg = ifelse(gray=="yes","red","blue")))
legend("topright", c("Less Visible","Mainstream"), title = "Literature", pch = 19,
col = c("red","blue"), title.font = 2, cex = .8)
box()
Figure 4. Contour-Enhanced Funnel Plot of Individual Effects
# This time get the tabular counts of individual effects that are sig.
g <- with(dat7,
contour_funnel(x = g2,
vi = v_g2, sig = TRUE))
flextable(g) %>%
autofit() %>% set_caption("Statistically significant individual effects") %>% fontsize(size = 11, part = "all") %>%
line_spacing(space = .6, part = "all")
Total | Total(%) | Left | Left(%) | Right | Right(%) | Sig. |
---|---|---|---|---|---|---|
19 | 34.55 | 2 | 3.64 | 17 | 30.91 | 0.05 |
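The percentages in the two tables of statistically significant effects follow directly from the counts. As a quick arithmetic check, the sketch below recomputes them from the reported counts (27 study-level aggregates and 55 individual effects); the object names are illustrative.

```r
# Counts taken from the two tables of statistically significant effects above
study_level  <- c(left = 1, right = 6)  / 27 * 100   # 27 study-level aggregates
effect_level <- c(left = 2, right = 17) / 55 * 100   # 55 individual effects

# Percentages of significant negative (left) and positive (right) effects
round(rbind(study_level, effect_level), 2)
```

The rounded values agree with the Left(%) and Right(%) columns reported at each level.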
We also conducted an Egger’s test (Egger, Smith, Schneider, & Minder, 1997) of funnel plot symmetry. Using this test, we examined the extent to which the standard error (as a measure of precision) of the effect sizes collected from the note-taking literature related to the effect sizes’ magnitude. If such a relationship and/or its estimate of intercept, with the latter sometimes referred to as a precision-effect test (PET), rise to statistically significant levels, that could suggest asymmetry (and potentially publication bias) in the funnel plot of effect sizes.
In our case, given that the p-values for the relationship in question (b = 0.427, p = 0.784; 95% CI[-2.680, 3.533]) and for its estimate of intercept (a = 0.390, p = 0.370; 95% CI[-0.491, 1.271]) are both larger than 0.05, we concluded that our funnel plot is sufficiently symmetric and that the likelihood of publication bias in the collected sample of note-taking studies is small, with the caveat that the b estimate has a relatively wide CI.
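The Egger's test here uses the modified standard error `SE_egger` and not the conventional standard error of Hedges' g. One motivation, following Pustejovsky and Rodgers (2019), is that the conventional standard error of a standardized mean difference depends on the effect size itself, which can induce funnel plot asymmetry even without publication bias. The sketch below illustrates this with hypothetical group sizes and effect sizes; the large-sample variance formula is an assumption made for illustration.

```r
# Hypothetical group sizes and a range of effect sizes (illustrative only)
nT <- 20; nC <- 20
g  <- c(0, 0.5, 1, 2, 4)

# Conventional SE (large-sample approximation): grows with the size of g
se_conventional <- sqrt(1 / nT + 1 / nC + g^2 / (2 * (nT + nC)))

# Modified SE: depends only on the group sizes, so it is constant across g
se_modified <- rep(sqrt((nT + nC) / (nT * nC)), length(g))

data.frame(g, se_conventional, se_modified)
```

With the sample sizes held fixed, only the conventional standard error changes as the effect size grows, which is the artificial dependence the modified version avoids.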
# Egger's test using the same naturally and statistically occurring dependence
ff = rma.mv(g2 ~ SE_egger, V = Vs,
random = ~1|study/effect, data = dat7,
dfs = "contain")
g <- results_rma(ff, drop_rows = 3:7, drop_cols = 9:10, tidy = TRUE)
flextable(dplyr::select(g, -Df)) %>%
autofit() %>% set_caption("Egger's Test Results") %>% fontsize(size = 11, part = "all") %>%
line_spacing(space = .6, part = "all")
Terms | Estimate | SE | t | p-value | Sig. | Lower | Upper |
---|---|---|---|---|---|---|---|
(Intercept) | 0.390 | 0.428 | 0.912 | 0.370 | | -0.491 | 1.271 |
SE_egger | 0.427 | 1.549 | 0.275 | 0.784 | | -2.680 | 3.533 |
In this section, we present the results of our analyses in two parts. In the first part, we present the synthesized effects at each time point. As mentioned in the manuscript, results based on a limited number of effects (M) and/or studies (K) should be ignored due to their unreliable nature.
Also presented in the first part is the \(R^2\) test of heterogeneity. \(R^2\) indicates the percentage reduction in the total heterogeneity (between and within studies) in the true effects of note-taking when moving from a model without any MUTOS moderator (a null model) to a model that includes a set of MUTOS moderators of interest.
While necessary, the results presented in the first part may not by themselves immediately translate into evidence-based recommendations. This is because the descriptive (synthesized average effects) and the associated inferential results (CIs and p-values) simply denote how much effect at each measurement occasion exists and if that effect is reliably different from 0 at that point in time.
In the second part, we compare the changes that occurred in learners’ performance from one measurement occasion (baseline) to another (post-test) to specifically measure the potential learning “gains” that might have resulted from note-taking treatments taking into account the methodological differences that differentiate the studies to varying degrees (see Table 2 in the manuscript for more details on moderators).
Because the second part allows us to measure the gains from note-taking across more than one occasion, the results (i.e., synthesized average effects and their associated CIs and p-values) more immediately translate into evidence-based recommendations. To further facilitate such recommendations, in the second part we also measure a minimum expected benefit of using note taking presented in the universal metric of percentages.
table_names <-
c("Time",
"Note-Taking Type",
"Outcome",
"Measure Type",
"Input Mode",
"Material Type",
"Optional Note-Taking",
"Proficiency",
"L1-L2 Orthographic Differences",
"Region")
# Fitted moderator models stored from the previous section
results <- setNames(readRDS(url("https://github.com/fpqq/w/raw/main/np.rds")), table_names)
moderators_abb_names <- c(
"time",
"treat_grp",
"outcome",
"measure",
"input_mode",
"material_type",
"note_option",
"prof",
"script_distance",
"region")
# A list of time and moderators interacting with time
LIST <- c("time", map(moderators_abb_names[-1], c, "time"))
# Count # of studies and effects at each time
effect_no <- setNames(map(LIST, ~effect_count(dat7, study, !!!syms(.), show0=FALSE, na.rm=TRUE, arrange_by="time")), table_names)
rs <- results
invisible(lapply(table_names, \(i){
cat(paste0("\n\n### ", i, "\n"))
g <- rs[[i]]$ems
g3 <- rs[[i]]$plot
g4 <- rs[[i]]$R2
if(i!="Overall") print(g3)
print(kable(dplyr::select(cbind(g$table, dplyr::select(effect_no[[i]], `n study`, `n effect`)), -Df) %>% rename(K=`n study`, M=`n effect`),format = "simple", table.attr = "style='width:40%;'",
caption = paste("3M results for",tolower(i),"categories")) %>%
kable_styling(bootstrap_options = "bordered",
full_width = TRUE, font_size = 9.5))
print(kable(g4 %>% rename(`Total Heterogeneity`=`Sigma(total)`, `Between-study Heterogeneity`=`Sigma(study)`, `Within-study Heterogeneity`=`Sigma(effect)`), format = "simple", table.attr = "style='width:40%;'",
caption = paste("R2 test of heterogeneity for",tolower(i))) %>%
add_footnote(c("Heterogeneity is in SD unit.",paste("The *p-value* indicates the statistical significance of the MUTOS moderators in the",i, "model *collectively*."))) %>%
kable_styling(bootstrap_options = "bordered",
full_width = TRUE, font_size = 9.5))
}))
time | Mean | SE | Lower | Upper | t | p-value | Sig. | K | M |
---|---|---|---|---|---|---|---|---|---|
baseline | -0.198 | 0.157 | -0.528 | 0.132 | -1.261 | 0.223 | | 16 | 18 |
posttest1 | 0.713 | 0.124 | 0.452 | 0.973 | 5.747 | 0.000 | *** | 25 | 33 |
posttest2 | 0.358 | 0.272 | -0.214 | 0.930 | 1.314 | 0.205 | | 3 | 4 |
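As a rough illustration of how the time-specific means in the table above relate to a learning gain, the sketch below takes the difference between the first post-test and the baseline estimates. It gives a point estimate only; a confidence interval for the gain would also need the covariance between the two estimates, which is not shown in this table.

```r
# Synthesized mean effects copied from the Time table above
baseline  <- -0.198
posttest1 <-  0.713

# Point estimate of the gain from baseline to the first post-test
gain <- posttest1 - baseline
gain
```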
Model | Total Heterogeneity | Between-study Heterogeneity | Within-study Heterogeneity | p-value | R2 |
---|---|---|---|---|---|
No (M)UTOS | 0.661 | 0.157 | 0.642 | | |
Time | 0.438 | 0.253 | 0.358 | 0.001 | 33.658% |
Note: a Heterogeneity is in SD unit. b The p-value indicates the statistical significance of the MUTOS moderators in the Time model collectively.
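The values in the table above appear consistent with reading \(R^2\) as the percentage reduction in total heterogeneity expressed in SD units. The sketch below recomputes it for the Time model under two assumptions that are inferred from the reported numbers and not confirmed by the source code: that total heterogeneity is the square root of the summed between- and within-study variances, and that \(R^2\) is the relative reduction in that total.

```r
# Heterogeneity estimates (SD units) copied from the R2 table for the Time model
null_between <- 0.157; null_within <- 0.642   # "No (M)UTOS" row
time_between <- 0.253; time_within <- 0.358   # "Time" row

# Assumption: total heterogeneity = square root of the summed variance components
null_total <- sqrt(null_between^2 + null_within^2)
time_total <- sqrt(time_between^2 + time_within^2)

# Assumption: R2 = percentage reduction in total heterogeneity (SD units)
R2 <- (null_total - time_total) / null_total * 100

round(c(null_total = null_total, time_total = time_total, R2 = R2), 3)
```

Under these assumptions the recomputed totals match the reported 0.661 and 0.438, and the recomputed \(R^2\) is close to the reported 33.658%, with the small gap attributable to rounding of the inputs.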
treat_grp | time | Mean | SE | Lower | Upper | t | p-value | Sig. | K | M |
---|---|---|---|---|---|---|---|---|---|---|
conventional | baseline | 0.166 | 0.300 | -0.503 | 0.834 | 0.552 | 0.593 | | 3 | 4 |
framework notes | baseline | -0.428 | 0.285 | -1.063 | 0.206 | -1.504 | 0.163 | | 7 | 7 |
note-taking instruction | baseline | -0.133 | 0.300 | -0.802 | 0.536 | -0.444 | 0.667 | | 5 | 5 |
vocabulary notebook | baseline | -0.664 | 0.540 | -1.867 | 0.538 | -1.231 | 0.246 | | 1 | 2 |
conventional | posttest1 | 0.441 | 0.229 | -0.070 | 0.952 | 1.925 | 0.083 | . | 7 | 12 |
framework notes | posttest1 | 0.868 | 0.246 | 0.320 | 1.417 | 3.527 | 0.005 | ** | 9 | 9 |
note-taking instruction | posttest1 | 0.844 | 0.265 | 0.254 | 1.434 | 3.189 | 0.010 | ** | 7 | 7 |
vocabulary notebook | posttest1 | 1.187 | 0.417 | 0.259 | 2.115 | 2.849 | 0.017 | * | 3 | 5 |
conventional | posttest2 | 0.376 | 0.348 | -0.398 | 1.150 | 1.082 | 0.305 | | 1 | 2 |
note-taking instruction | posttest2 | 0.438 | 0.550 | -0.787 | 1.663 | 0.797 | 0.444 | | 1 | 1 |
vocabulary notebook | posttest2 | 0.455 | 0.726 | -1.162 | 2.072 | 0.627 | 0.545 | | 1 | 1 |
Model | Total Heterogeneity | Between-study Heterogeneity | Within-study Heterogeneity | p-value | R2 |
---|---|---|---|---|---|
No (M)UTOS | 0.661 | 0.157 | 0.642 | | |
Note-Taking Type | 0.384 | 0.250 | 0.291 | 0.014 | 41.937% |
Note: a Heterogeneity is in SD unit. b The p-value indicates the statistical significance of the MUTOS moderators in the Note-Taking Type model collectively.
outcome | time | Mean | SE | Lower | Upper | t | p-value | Sig. | K | M |
---|---|---|---|---|---|---|---|---|---|---|
listening | baseline | -0.269 | 0.326 | -0.987 | 0.450 | -0.823 | 0.428 | | 6 | 6 |
miscellaneous | baseline | -0.134 | 0.368 | -0.943 | 0.676 | -0.364 | 0.723 | | 3 | 3 |
reading | baseline | 0.238 | 0.476 | -0.810 | 1.286 | 0.500 | 0.627 | | 3 | 3 |
vocabulary | baseline | -0.181 | 0.259 | -0.751 | 0.388 | -0.701 | 0.498 | | 5 | 6 |
listening | posttest1 | 0.453 | 0.206 | 0.000 | 0.905 | 2.202 | 0.050 | * | 9 | 14 |
miscellaneous | posttest1 | 0.845 | 0.341 | 0.096 | 1.595 | 2.482 | 0.030 | * | 4 | 4 |
reading | posttest1 | 1.241 | 0.395 | 0.372 | 2.109 | 3.144 | 0.009 | ** | 5 | 5 |
vocabulary | posttest1 | 0.766 | 0.218 | 0.286 | 1.246 | 3.510 | 0.005 | ** | 8 | 10 |
miscellaneous | posttest2 | 0.507 | 0.493 | -0.578 | 1.593 | 1.028 | 0.326 | | 1 | 1 |
vocabulary | posttest2 | 0.355 | 0.348 | -0.412 | 1.122 | 1.018 | 0.331 | | 3 | 3 |
Model | Total Heterogeneity | Between-study Heterogeneity | Within-study Heterogeneity | p-value | R2 |
---|---|---|---|---|---|
No (M)UTOS | 0.661 | 0.157 | 0.642 | | |
Outcome | 0.456 | 0.166 | 0.425 | 0.055 | 30.929% |
Note: a Heterogeneity is in SD unit. b The p-value indicates the statistical significance of the MUTOS moderators in the Outcome model collectively.
measure | time | Mean | SE | Lower | Upper | t | p-value | Sig. | K | M |
---|---|---|---|---|---|---|---|---|---|---|
miscellaneous | baseline | -0.012 | 0.815 | -1.772 | 1.748 | -0.015 | 0.988 | | 3 | 3 |
recall | baseline | 0.008 | 0.271 | -0.577 | 0.594 | 0.031 | 0.976 | | 5 | 5 |
recognition | baseline | -0.292 | 0.213 | -0.753 | 0.169 | -1.367 | 0.195 | | 10 | 10 |
miscellaneous | posttest1 | 0.470 | 0.817 | -1.295 | 2.234 | 0.575 | 0.575 | | 3 | 3 |
recall | posttest1 | 1.017 | 0.250 | 0.476 | 1.558 | 4.063 | 0.001 | ** | 7 | 8 |
recognition | posttest1 | 0.598 | 0.148 | 0.278 | 0.919 | 4.034 | 0.001 | ** | 17 | 21 |
recall | posttest2 | 0.440 | 0.500 | -0.641 | 1.521 | 0.880 | 0.395 | | 1 | 1 |
recognition | posttest2 | 0.389 | 0.349 | -0.365 | 1.142 | 1.115 | 0.285 | | 3 | 3 |
Model | Total Heterogeneity | Between-study Heterogeneity | Within-study Heterogeneity | p-value | R2 |
---|---|---|---|---|---|
No (M)UTOS | 0.661 | 0.157 | 0.642 | | |
Measure Type | 0.458 | 0.133 | 0.438 | 0.043 | 30.651% |
Note: Heterogeneity is in SD units. The p-value indicates the collective statistical significance of the MUTOS moderators in the Measure Type model.
input_mode | time | Mean | SE | Lower | Upper | t | p-value | Sig. | K | M |
---|---|---|---|---|---|---|---|---|---|---|
listening | baseline | -0.050 | 0.271 | -0.641 | 0.541 | -0.185 | 0.856 | | 7 | 8 |
miscellaneous | baseline | -0.490 | 0.354 | -1.260 | 0.281 | -1.384 | 0.192 | | 5 | 6 |
reading | baseline | 0.111 | 0.380 | -0.717 | 0.939 | 0.292 | 0.775 | | 4 | 4 |
listening | posttest1 | 0.555 | 0.214 | 0.089 | 1.022 | 2.592 | 0.024 | * | 10 | 16 |
miscellaneous | posttest1 | 0.809 | 0.329 | 0.093 | 1.525 | 2.462 | 0.030 | * | 7 | 9 |
reading | posttest1 | 0.947 | 0.281 | 0.335 | 1.558 | 3.371 | 0.006 | ** | 8 | 8 |
listening | posttest2 | 0.385 | 0.383 | -0.449 | 1.219 | 1.005 | 0.335 | | 1 | 2 |
miscellaneous | posttest2 | 0.229 | 0.758 | -1.424 | 1.881 | 0.302 | 0.768 | | 1 | 1 |
reading | posttest2 | 0.530 | 0.603 | -0.784 | 1.845 | 0.879 | 0.397 | | 1 | 1 |
Model | Total Heterogeneity | Between-study Heterogeneity | Within-study Heterogeneity | p-value | R2 |
---|---|---|---|---|---|
No (M)UTOS | 0.661 | 0.157 | 0.642 | | |
Input Mode | 0.454 | 0.255 | 0.376 | 0.024 | 31.298% |
Note: Heterogeneity is in SD units. The p-value indicates the collective statistical significance of the MUTOS moderators in the Input Mode model.
material_type | time | Mean | SE | Lower | Upper | t | p-value | Sig. | K | M |
---|---|---|---|---|---|---|---|---|---|---|
academic | baseline | -0.462 | 0.189 | -0.865 | -0.058 | -2.440 | 0.028 | * | 13 | 14 |
non-academic | baseline | 0.402 | 0.291 | -0.217 | 1.022 | 1.384 | 0.187 | | 3 | 4 |
academic | posttest1 | 0.661 | 0.138 | 0.366 | 0.955 | 4.777 | 0.000 | *** | 19 | 25 |
non-academic | posttest1 | 0.896 | 0.230 | 0.405 | 1.387 | 3.892 | 0.001 | ** | 6 | 8 |
academic | posttest2 | 0.203 | 0.443 | -0.740 | 1.147 | 0.460 | 0.652 | | 2 | 2 |
non-academic | posttest2 | 0.692 | 0.351 | -0.057 | 1.441 | 1.970 | 0.068 | . | 1 | 2 |
Model | Total Heterogeneity | Between-study Heterogeneity | Within-study Heterogeneity | p-value | R2 |
---|---|---|---|---|---|
No (M)UTOS | 0.661 | 0.157 | 0.642 | | |
Material Type | 0.397 | 0.185 | 0.351 | 0.003 | 39.952% |
Note: Heterogeneity is in SD units. The p-value indicates the collective statistical significance of the MUTOS moderators in the Material Type model.
note_option | time | Mean | SE | Lower | Upper | t | p-value | Sig. | K | M |
---|---|---|---|---|---|---|---|---|---|---|
allowed | baseline | 0.274 | 0.264 | -0.291 | 0.840 | 1.040 | 0.316 | | 5 | 6 |
required | baseline | -0.515 | 0.207 | -0.959 | -0.071 | -2.488 | 0.026 | * | 10 | 11 |
allowed | posttest1 | 0.731 | 0.224 | 0.250 | 1.212 | 3.258 | 0.006 | ** | 8 | 11 |
required | posttest1 | 0.782 | 0.165 | 0.428 | 1.135 | 4.742 | 0.000 | *** | 16 | 20 |
allowed | posttest2 | 0.567 | 0.345 | -0.172 | 1.306 | 1.645 | 0.122 | | 1 | 2 |
required | posttest2 | 0.260 | 0.431 | -0.664 | 1.184 | 0.603 | 0.556 | | 2 | 2 |
Model | Total Heterogeneity | Between-study Heterogeneity | Within-study Heterogeneity | p-value | R2 |
---|---|---|---|---|---|
No (M)UTOS | 0.661 | 0.157 | 0.642 | | |
Note-Taking Option | 0.409 | 0.273 | 0.305 | 0.002 | 38.023% |
Note: Heterogeneity is in SD units. The p-value indicates the collective statistical significance of the MUTOS moderators in the Note-Taking Option model.
prof | time | Mean | SE | Lower | Upper | t | p-value | Sig. | K | M |
---|---|---|---|---|---|---|---|---|---|---|
beginner to lower intermediate | baseline | -0.719 | 0.306 | -1.386 | -0.052 | -2.350 | 0.037 | * | 3 | 4 |
high intermediate to advanced | baseline | 0.425 | 0.369 | -0.380 | 1.229 | 1.150 | 0.273 | | 2 | 3 |
intermediate | baseline | -0.134 | 0.221 | -0.615 | 0.346 | -0.609 | 0.554 | | 11 | 11 |
beginner to lower intermediate | posttest1 | 0.736 | 0.227 | 0.241 | 1.230 | 3.241 | 0.007 | ** | 7 | 9 |
high intermediate to advanced | posttest1 | 1.037 | 0.317 | 0.346 | 1.729 | 3.268 | 0.007 | ** | 4 | 5 |
intermediate | posttest1 | 0.624 | 0.178 | 0.236 | 1.012 | 3.508 | 0.004 | ** | 14 | 19 |
beginner to lower intermediate | posttest2 | 0.292 | 0.729 | -1.295 | 1.880 | 0.401 | 0.695 | | 1 | 1 |
high intermediate to advanced | posttest2 | 0.758 | 0.396 | -0.106 | 1.621 | 1.912 | 0.080 | . | 1 | 2 |
intermediate | posttest2 | 0.297 | 0.558 | -0.920 | 1.513 | 0.532 | 0.605 | | 1 | 1 |
Model | Total Heterogeneity | Between-study Heterogeneity | Within-study Heterogeneity | p-value | R2 |
---|---|---|---|---|---|
No (M)UTOS | 0.661 | 0.157 | 0.642 | | |
Proficiency | 0.441 | 0.283 | 0.337 | 0.013 | 33.299% |
Note: Heterogeneity is in SD units. The p-value indicates the collective statistical significance of the MUTOS moderators in the Proficiency model.
script_distance | time | Mean | SE | Lower | Upper | t | p-value | Sig. | K | M |
---|---|---|---|---|---|---|---|---|---|---|
greater | baseline | -0.251 | 0.184 | -0.641 | 0.139 | -1.363 | 0.192 | | 13 | 14 |
shorter | baseline | -0.061 | 0.378 | -0.862 | 0.741 | -0.160 | 0.875 | | 3 | 4 |
greater | posttest1 | 0.568 | 0.151 | 0.249 | 0.888 | 3.769 | 0.002 | ** | 20 | 26 |
shorter | posttest1 | 1.250 | 0.299 | 0.616 | 1.884 | 4.177 | 0.001 | *** | 5 | 7 |
greater | posttest2 | 0.283 | 0.280 | -0.309 | 0.876 | 1.014 | 0.326 | | 3 | 4 |
Model | Total Heterogeneity | Between-study Heterogeneity | Within-study Heterogeneity | p-value | R2 |
---|---|---|---|---|---|
No (M)UTOS | 0.661 | 0.157 | 0.642 | | |
L1-L2 Orthographic Distance | 0.473 | 0.314 | 0.354 | 0.002 | 28.416% |
Note: Heterogeneity is in SD units. The p-value indicates the collective statistical significance of the MUTOS moderators in the L1-L2 Orthographic Distance model.
region | time | Mean | SE | Lower | Upper | t | p-value | Sig. | K | M |
---|---|---|---|---|---|---|---|---|---|---|
East Asia | baseline | -0.001 | 0.224 | -0.483 | 0.480 | -0.006 | 0.995 | | 6 | 7 |
Middle East | baseline | -0.192 | 0.211 | -0.646 | 0.261 | -0.910 | 0.378 | | 10 | 11 |
East Asia | posttest1 | 0.786 | 0.226 | 0.300 | 1.272 | 3.470 | 0.004 | ** | 6 | 7 |
Europe/North America | posttest1 | 0.248 | 0.268 | -0.327 | 0.822 | 0.924 | 0.371 | | 4 | 8 |
Middle East | posttest1 | 0.885 | 0.171 | 0.518 | 1.252 | 5.173 | 0.000 | *** | 15 | 18 |
East Asia | posttest2 | 0.465 | 0.343 | -0.271 | 1.201 | 1.355 | 0.197 | | 1 | 2 |
Middle East | posttest2 | 0.371 | 0.455 | -0.605 | 1.348 | 0.815 | 0.429 | | 2 | 2 |
Model | Total Heterogeneity | Between-study Heterogeneity | Within-study Heterogeneity | p-value | R2 |
---|---|---|---|---|---|
No (M)UTOS | 0.661 | 0.157 | 0.642 | | |
Region | 0.395 | 0.107 | 0.380 | 0.010 | 40.203% |
Note: Heterogeneity is in SD units. The p-value indicates the collective statistical significance of the MUTOS moderators in the Region model.
library(knitr)       # kable()
library(kableExtra)  # kable_styling() and the %>% pipe
rs <- results  # `results`, `table_names`, contrast_rma() and prob_rma() are assumed to be defined earlier in this document
invisible(lapply(table_names, \(i) {
  cat(paste0("\n\n### ", i, "\n"))
  g <- rs[[i]]$ems
  # Effects: learning gains relative to baseline
  gains <- if (i == "Time") {
    contrast_rma(g, list("Gain1(post-test 1 - baseline)" = c(2, -1)))
  } else {
    contrast_rma(g, brief = TRUE)
  }
  print(kable(dplyr::select(gains$table, -Df), format = "simple",
              table.attr = "style='width:40%;'",
              caption = paste("Learning gains for", tolower(i))) %>%
          kable_styling(bootstrap_options = "bordered",
                        full_width = TRUE, font_size = 9.5))
  # Differences in learning gains between moderator levels (skipped for "Time")
  if (i != "Time") {
    gain_dif <- contrast_rma(g, gain_dif = TRUE, brief = TRUE, gain_dif_type = "same")
    print(kable(dplyr::select(gain_dif$table, -Df), format = "simple",
                table.attr = "style='width:40%;'",
                caption = paste("Differences in learning gains for", tolower(i))) %>%
            kable_styling(bootstrap_options = "bordered",
                          full_width = TRUE, font_size = 9.5))
  }
  # Percentages: probability that the gain is 0.2 or larger
  gain_prob <- prob_rma(gains, gain = TRUE, target_effect = .2)
  print(kable(gain_prob, format = "simple", table.attr = "style='width:40%;'",
              caption = paste("Minimum learning gain percentage for", tolower(i))) %>%
          kable_styling(bootstrap_options = "bordered",
                        full_width = TRUE, font_size = 9.5))
}))
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(post-test 1 - baseline) | 0.911 | 0.167 | 0.561 | 1.261 | 5.465 | 0.000 | *** |
Term | Target_Effect | Probability | Min | Max |
---|---|---|---|---|
Gain1(post-test 1 - baseline) | 0.2 or larger | 79.97% | 63.39% | 94.98% |
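As a reading aid for the contrast tables, the columns of the Time gain row can be related to one another with simple arithmetic. This is a minimal sketch that assumes the usual convention that t is the estimate divided by its SE and that the interval is symmetric around the estimate; the degrees of freedom are not shown in the table, so the code only recovers the critical value implied by the interval rather than assuming one. Variable names are illustrative.

```r
# Values copied from the row "Gain1(post-test 1 - baseline)".
estimate <- 0.911
se       <- 0.167
lower    <- 0.561
upper    <- 1.261

# t statistic as estimate / SE. It differs slightly from the reported 5.465
# because the inputs are rounded to three decimals.
t_ratio <- estimate / se

# Midpoint of the interval: should match the estimate if the interval is symmetric.
midpoint <- (lower + upper) / 2

# Critical value implied by the interval half-width (no degrees of freedom assumed).
implied_crit <- ((upper - lower) / 2) / se

print(round(c(t_ratio = t_ratio, midpoint = midpoint, implied_crit = implied_crit), 3))
```

An implied critical value somewhat above 1.96 is what a t-based interval with a modest number of degrees of freedom would produce.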
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(conventional) | 0.275 | 0.251 | -0.284 | 0.835 | 1.096 | 0.299 | |
Gain2(conventional) | 0.210 | 0.315 | -0.492 | 0.913 | 0.667 | 0.520 | |
Gain1(framework notes) | 1.297 | 0.278 | 0.677 | 1.917 | 4.659 | 0.001 | *** |
Gain1(note-taking instruction) | 0.977 | 0.287 | 0.338 | 1.616 | 3.407 | 0.007 | ** |
Gain2(note-taking instruction) | 0.571 | 0.541 | -0.634 | 1.776 | 1.056 | 0.316 | |
Gain1(vocabulary notebook) | 1.852 | 0.468 | 0.809 | 2.894 | 3.958 | 0.003 | ** |
Gain2(vocabulary notebook) | 1.119 | 0.832 | -0.733 | 2.972 | 1.346 | 0.208 | |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(conventional) - Gain1(framework notes) | -1.021 | 0.373 | -1.852 | -0.190 | -2.738 | 0.021 | * |
Gain1(conventional) - Gain1(note-taking instruction) | -0.702 | 0.382 | -1.553 | 0.149 | -1.838 | 0.096 | . |
Gain1(conventional) - Gain1(vocabulary notebook) | -1.576 | 0.532 | -2.761 | -0.391 | -2.962 | 0.014 | * |
Gain2(conventional) - Gain2(note-taking instruction) | -0.361 | 0.627 | -1.757 | 1.036 | -0.576 | 0.577 | |
Gain2(conventional) - Gain2(vocabulary notebook) | -0.909 | 0.889 | -2.890 | 1.072 | -1.023 | 0.331 | |
Gain1(framework notes) - Gain1(note-taking instruction) | 0.319 | 0.398 | -0.568 | 1.207 | 0.802 | 0.441 | |
Gain1(framework notes) - Gain1(vocabulary notebook) | -0.555 | 0.539 | -1.757 | 0.647 | -1.029 | 0.328 | |
Gain1(note-taking instruction) - Gain1(vocabulary notebook) | -0.874 | 0.546 | -2.090 | 0.341 | -1.603 | 0.140 | |
Gain2(note-taking instruction) - Gain2(vocabulary notebook) | -0.548 | 1.000 | -2.777 | 1.680 | -0.548 | 0.596 | |
Term | Target_Effect | Probability | Min | Max |
---|---|---|---|---|
Gain1(conventional) | 0.2 or larger | 53.92% | 31.80% | 91.65% |
Gain2(conventional) | 0.2 or larger | 50.52% | 24.93% | 93.96% |
Gain1(framework notes) | 0.2 or larger | 92.48% | 67.96% | 99.99% |
Gain1(note-taking instruction) | 0.2 or larger | 84.58% | 55.37% | 99.90% |
Gain2(note-taking instruction) | 0.2 or larger | 68.66% | 20.74% | 99.97% |
Gain1(vocabulary notebook) | 0.2 or larger | 98.48% | 72.43% | 100.00% |
Gain2(vocabulary notebook) | 0.2 or larger | 88.58% | 18.08% | 100.00% |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(listening) | 0.721 | 0.354 | -0.058 | 1.501 | 2.036 | 0.067 | . |
Gain1(miscellaneous) | 0.979 | 0.444 | 0.001 | 1.957 | 2.204 | 0.050 | * |
Gain2(miscellaneous) | 0.641 | 0.585 | -0.647 | 1.929 | 1.095 | 0.297 | |
Gain1(reading) | 1.002 | 0.481 | -0.056 | 2.061 | 2.085 | 0.061 | . |
Gain1(vocabulary) | 0.947 | 0.281 | 0.328 | 1.567 | 3.366 | 0.006 | ** |
Gain2(vocabulary) | 0.536 | 0.399 | -0.342 | 1.414 | 1.343 | 0.206 | |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(listening) - Gain1(miscellaneous) | -0.258 | 0.564 | -1.500 | 0.983 | -0.458 | 0.656 | |
Gain1(listening) - Gain1(reading) | -0.281 | 0.601 | -1.604 | 1.041 | -0.468 | 0.649 | |
Gain1(listening) - Gain1(vocabulary) | -0.226 | 0.450 | -1.216 | 0.763 | -0.503 | 0.625 | |
Gain1(miscellaneous) - Gain1(reading) | -0.023 | 0.656 | -1.467 | 1.420 | -0.035 | 0.972 | |
Gain1(miscellaneous) - Gain1(vocabulary) | 0.032 | 0.524 | -1.122 | 1.186 | 0.061 | 0.953 | |
Gain2(miscellaneous) - Gain2(vocabulary) | 0.105 | 0.698 | -1.433 | 1.642 | 0.150 | 0.883 | |
Gain1(reading) - Gain1(vocabulary) | 0.055 | 0.558 | -1.173 | 1.283 | 0.099 | 0.923 | |
Term | Target_Effect | Probability | Min | Max |
---|---|---|---|---|
Gain1(listening) | 0.2 or larger | 71.40% | 41.08% | 96.60% |
Gain1(miscellaneous) | 0.2 or larger | 80.09% | 43.09% | 99.31% |
Gain2(miscellaneous) | 0.2 or larger | 68.38% | 22.94% | 99.23% |
Gain1(reading) | 0.2 or larger | 80.78% | 41.14% | 99.55% |
Gain1(vocabulary) | 0.2 or larger | 79.11% | 54.46% | 97.24% |
Gain2(vocabulary) | 0.2 or larger | 64.22% | 31.78% | 95.57% |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(miscellaneous) | 0.482 | 0.894 | -1.450 | 2.414 | 0.539 | 0.599 | |
Gain1(recall) | 1.009 | 0.335 | 0.286 | 1.732 | 3.014 | 0.010 | ** |
Gain2(recall) | 0.432 | 0.553 | -0.762 | 1.626 | 0.781 | 0.449 | |
Gain1(recognition) | 0.890 | 0.240 | 0.372 | 1.408 | 3.712 | 0.003 | ** |
Gain2(recognition) | 0.681 | 0.390 | -0.161 | 1.522 | 1.747 | 0.104 | |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(miscellaneous) - Gain1(recall) | -0.527 | 0.955 | -2.590 | 1.536 | -0.552 | 0.590 | |
Gain1(miscellaneous) - Gain1(recognition) | -0.408 | 0.926 | -2.409 | 1.592 | -0.441 | 0.666 | |
Gain1(recall) - Gain1(recognition) | 0.119 | 0.406 | -0.759 | 0.996 | 0.292 | 0.775 | |
Gain2(recall) - Gain2(recognition) | -0.249 | 0.666 | -1.687 | 1.190 | -0.374 | 0.715 | |
Term | Target_Effect | Probability | Min | Max |
---|---|---|---|---|
Gain1(miscellaneous) | 0.2 or larger | 61.83% | 7.54% | 99.87% |
Gain1(recall) | 0.2 or larger | 80.62% | 52.98% | 98.14% |
Gain2(recall) | 0.2 or larger | 59.78% | 20.11% | 97.38% |
Gain1(recognition) | 0.2 or larger | 76.94% | 55.95% | 94.98% |
Gain2(recognition) | 0.2 or larger | 69.63% | 37.66% | 96.39% |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(listening) | 0.606 | 0.260 | 0.039 | 1.173 | 2.327 | 0.038 | * |
Gain2(listening) | 0.435 | 0.371 | -0.374 | 1.244 | 1.172 | 0.264 | |
Gain1(miscellaneous) | 1.298 | 0.291 | 0.665 | 1.932 | 4.467 | 0.001 | *** |
Gain2(miscellaneous) | 0.718 | 0.792 | -1.008 | 2.444 | 0.907 | 0.382 | |
Gain1(reading) | 0.836 | 0.366 | 0.039 | 1.632 | 2.285 | 0.041 | * |
Gain2(reading) | 0.419 | 0.623 | -0.939 | 1.777 | 0.673 | 0.514 |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(listening) - Gain1(miscellaneous) | -0.693 | 0.388 | -1.539 | 0.154 | -1.783 | 0.100 | . |
Gain1(listening) - Gain1(reading) | -0.230 | 0.447 | -1.204 | 0.744 | -0.514 | 0.617 | |
Gain2(listening) - Gain2(miscellaneous) | -0.283 | 0.870 | -2.179 | 1.613 | -0.325 | 0.750 | |
Gain2(listening) - Gain2(reading) | 0.016 | 0.727 | -1.568 | 1.600 | 0.022 | 0.983 | |
Gain1(miscellaneous) - Gain1(reading) | 0.463 | 0.467 | -0.554 | 1.479 | 0.992 | 0.341 | |
Gain2(miscellaneous) - Gain2(reading) | 0.299 | 1.017 | -1.916 | 2.514 | 0.294 | 0.774 |
Term | Target_Effect | Probability | Min | Max |
---|---|---|---|---|
Gain1(listening) | 0.2 or larger | 68.02% | 44.19% | 93.73% |
Gain2(listening) | 0.2 or larger | 60.68% | 30.12% | 94.99% |
Gain1(miscellaneous) | 0.2 or larger | 89.74% | 66.35% | 99.68% |
Gain2(miscellaneous) | 0.2 or larger | 72.49% | 13.64% | 99.98% |
Gain1(reading) | 0.2 or larger | 76.84% | 44.19% | 98.79% |
Gain2(reading) | 0.2 or larger | 59.97% | 15.06% | 99.35% |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(academic) | 1.122 | 0.201 | 0.694 | 1.550 | 5.591 | 0.000 | *** |
Gain2(academic) | 0.665 | 0.459 | -0.312 | 1.642 | 1.450 | 0.168 | |
Gain1(non-academic) | 0.494 | 0.287 | -0.117 | 1.104 | 1.722 | 0.106 | |
Gain2(non-academic) | 0.289 | 0.363 | -0.484 | 1.063 | 0.798 | 0.437 |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(academic) - Gain1(non-academic) | 0.629 | 0.352 | -0.121 | 1.378 | 1.788 | 0.094 | . |
Gain2(academic) - Gain2(non-academic) | 0.376 | 0.586 | -0.872 | 1.624 | 0.641 | 0.531 |
Term | Target_Effect | Probability | Min | Max |
---|---|---|---|---|
Gain1(academic) | 0.2 or larger | 86.45% | 67.87% | 98.87% |
Gain2(academic) | 0.2 or larger | 71.06% | 31.52% | 99.26% |
Gain1(non-academic) | 0.2 or larger | 63.72% | 38.29% | 93.66% |
Gain2(non-academic) | 0.2 or larger | 54.23% | 26.02% | 92.75% |
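The point estimates in the gain tables for material type can be reproduced from the estimated means reported in the material_type table. The sketch below is a cross-check of the point estimates only. It assumes, following the "post-test 1 - baseline" label used for the Time contrast, that Gain1 is the posttest1 mean minus the baseline mean and that Gain2 is the posttest2 mean minus the baseline mean. It does not reproduce the standard errors, which depend on the fitted model. Variable names are illustrative.

```r
# Estimated means copied from the material_type table.
baseline  <- c("academic" = -0.462, "non-academic" = 0.402)
posttest1 <- c("academic" =  0.661, "non-academic" = 0.896)
posttest2 <- c("academic" =  0.203, "non-academic" = 0.692)

# Gains: post-test mean minus baseline mean, per material type.
gain1 <- posttest1 - baseline
gain2 <- posttest2 - baseline

# Differences in gains between material types (academic minus non-academic).
gain1_diff <- gain1[["academic"]] - gain1[["non-academic"]]
gain2_diff <- gain2[["academic"]] - gain2[["non-academic"]]

print(round(rbind(gain1, gain2), 3))
print(round(c(gain1_diff = gain1_diff, gain2_diff = gain2_diff), 3))
```

The results agree with the reported contrasts to within rounding of the three-decimal inputs.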
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(allowed) | 0.457 | 0.235 | -0.047 | 0.960 | 1.946 | 0.072 | . |
Gain2(allowed) | 0.293 | 0.319 | -0.391 | 0.977 | 0.918 | 0.374 | |
Gain1(required) | 1.297 | 0.203 | 0.861 | 1.733 | 6.385 | 0.000 | *** |
Gain2(required) | 0.775 | 0.442 | -0.173 | 1.722 | 1.754 | 0.101 |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(allowed) - Gain1(required) | -0.840 | 0.308 | -1.501 | -0.179 | -2.726 | 0.016 | * |
Gain2(allowed) - Gain2(required) | -0.482 | 0.544 | -1.650 | 0.685 | -0.886 | 0.391 |
Term | Target_Effect | Probability | Min | Max |
---|---|---|---|---|
Gain1(allowed) | 0.2 or larger | 62.89% | 40.40% | 92.68% |
Gain2(allowed) | 0.2 or larger | 54.74% | 28.04% | 93.12% |
Gain1(required) | 0.2 or larger | 91.98% | 74.23% | 99.83% |
Gain2(required) | 0.2 or larger | 76.91% | 35.68% | 99.82% |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(beginner to lower intermediate) | 1.455 | 0.294 | 0.814 | 2.096 | 4.944 | 0.000 | *** |
Gain2(beginner to lower intermediate) | 1.011 | 0.777 | -0.681 | 2.704 | 1.302 | 0.217 | |
Gain1(high intermediate to advanced) | 0.613 | 0.326 | -0.098 | 1.324 | 1.877 | 0.085 | . |
Gain2(high intermediate to advanced) | 0.333 | 0.363 | -0.458 | 1.125 | 0.917 | 0.377 | |
Gain1(intermediate) | 0.758 | 0.241 | 0.233 | 1.284 | 3.143 | 0.008 | ** |
Gain2(intermediate) | 0.431 | 0.567 | -0.804 | 1.666 | 0.760 | 0.462 |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(beginner to lower intermediate) - Gain1(high intermediate to advanced) | 0.842 | 0.440 | -0.117 | 1.802 | 1.912 | 0.080 | . |
Gain1(beginner to lower intermediate) - Gain1(intermediate) | 0.697 | 0.376 | -0.124 | 1.517 | 1.851 | 0.089 | . |
Gain2(beginner to lower intermediate) - Gain2(high intermediate to advanced) | 0.678 | 0.856 | -1.187 | 2.543 | 0.792 | 0.443 | |
Gain2(beginner to lower intermediate) - Gain2(intermediate) | 0.580 | 0.969 | -1.530 | 2.691 | 0.599 | 0.560 | |
Gain1(high intermediate to advanced) - Gain1(intermediate) | -0.146 | 0.406 | -1.030 | 0.739 | -0.359 | 0.726 | |
Gain2(high intermediate to advanced) - Gain2(intermediate) | -0.098 | 0.674 | -1.567 | 1.371 | -0.145 | 0.887 |
Term | Target_Effect | Probability | Min | Max |
---|---|---|---|---|
Gain1(beginner to lower intermediate) | 0.2 or larger | 93.67% | 71.86% | 99.95% |
Gain2(beginner to lower intermediate) | 0.2 or larger | 83.83% | 20.32% | 100.00% |
Gain1(high intermediate to advanced) | 0.2 or larger | 69.24% | 38.94% | 97.41% |
Gain2(high intermediate to advanced) | 0.2 or larger | 56.43% | 26.76% | 94.53% |
Gain1(intermediate) | 0.2 or larger | 75.15% | 51.24% | 96.97% |
Gain2(intermediate) | 0.2 or larger | 61.07% | 17.21% | 99.44% |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(greater) | 0.819 | 0.187 | 0.423 | 1.216 | 4.379 | 0.000 | *** |
Gain2(greater) | 0.534 | 0.287 | -0.075 | 1.143 | 1.860 | 0.081 | . |
Gain1(shorter) | 1.311 | 0.365 | 0.537 | 2.084 | 3.592 | 0.002 | ** |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(greater) - Gain1(shorter) | -0.492 | 0.408 | -1.357 | 0.373 | -1.205 | 0.246 |
Term | Target_Effect | Probability | Min | Max |
---|---|---|---|---|
Gain1(greater) | 0.2 or larger | 76.91% | 58.35% | 94.52% |
Gain2(greater) | 0.2 or larger | 65.43% | 39.74% | 93.12% |
Gain1(shorter) | 0.2 or larger | 90.67% | 62.50% | 99.85% |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(East Asia) | 0.787 | 0.272 | 0.204 | 1.370 | 2.895 | 0.012 | * |
Gain2(East Asia) | 0.466 | 0.374 | -0.337 | 1.270 | 1.246 | 0.233 | |
Gain1(Middle East) | 1.077 | 0.230 | 0.584 | 1.571 | 4.683 | 0.000 | *** |
Gain2(Middle East) | 0.563 | 0.477 | -0.461 | 1.588 | 1.180 | 0.258 | |
Contrast | Estimate | SE | Lower | Upper | t | p-value | Sig. |
---|---|---|---|---|---|---|---|
Gain1(East Asia) - Gain1(Middle East) | -0.290 | 0.356 | -1.054 | 0.474 | -0.815 | 0.429 | |
Gain2(East Asia) - Gain2(Middle East) | -0.097 | 0.609 | -1.403 | 1.209 | -0.159 | 0.876 | |
Term | Target_Effect | Probability | Min | Max |
---|---|---|---|---|
Gain1(East Asia) | 0.2 or larger | 74.96% | 50.15% | 96.08% |
Gain2(East Asia) | 0.2 or larger | 61.98% | 31.03% | 94.62% |
Gain1(Middle East) | 0.2 or larger | 84.27% | 63.83% | 98.04% |
Gain2(Middle East) | 0.2 or larger | 66.14% | 27.11% | 98.16% |
The following studies (k = 27) were included in the meta-analysis.
Reference |
---|
1- Ahmad, S. Z. (2019). Impact of Cornell notes vs. REAP on EFL secondary school students' critical reading skills. International Education Studies, 12(10), 60–74. |
2- Alahmadi, N. S. (2020). The effect of the mind mapping strategy on the L2 vocabulary learning of Saudi learners. Education and Linguistics Research, 6(1), 54–68. |
3- Al-Ghazo, A. (2023). The impact of note-taking strategy on EFL learners’ listening comprehension. Theory and Practice in Language Studies, 13(5), 1136–1147. |
4- Alzu'bi, M. A. (2019). The influence of suggested Cornell note-taking method on improving writing composition skills of Jordanian EFL learners. Journal of Language Teaching and Research, 10(4), 863–871. |
5- Amini, S., & Sadati, Z. (2023). The contribution of Cornell note-taking strategy instruction to the listening comprehension of Iranian EFL learners: A case of learners’ perception. Biannual Journal of Education Experiences, 6(1), 169–187. |
6- Aminifard, Y., & Aminifard, A. (2012). Note-taking and listening comprehension of conversations and mini-lectures: any benefit? Canadian Social Science, 8(4), 47–51. |
7- Bozorgian, H., & Pillay, H. (2013). Enhancing foreign language learning through listening strategy delivered in L1: An experimental study. International Journal of Instruction, 6(1). |
8- Chen, H. J. H., & Yang, T. Y. C. (2013). The impact of adventure video games on foreign language learning and the perceptions of learners. Interactive Learning Environments, 21(2), 129–141. |
9- Hale, G. A., & Courtney, R. (1994). The effects of note-taking on listening comprehension in the Test of English as a Foreign Language. Language Testing, 11(1), 29–47. |
10- Hayati, A. M., & Jalilifar, A. (2009). The impact of cultural knowledge on listening comprehension of EFL learners. English Language Teaching, 2(1), 101–111. |
11- Jin, Z., & Webb, S. (2021). Does writing words in notes contribute to vocabulary learning? Language Teaching Research. Advance online publication. |
12- Kang, E. Y. (2010). Effects of output and note-taking on noticing and interlanguage development. TESOL and Applied Linguistics, 10(2), 19–36. |
13- Kashani, S., & Shafiee, S. (2016). A comparison of vocabulary learning strategies among elementary Iranian EFL learners. Journal of Language Teaching and Research, 7(3), 511–518. |
14- Kilickaya, F., & Cokal-Karadas, D. (2009). The effect of note-taking on university students’ listening comprehension of lectures. Kastamonu Education Journal, 17(1), 47–56. |
15- Mežek, Š. (2013). Learning terminology from reading texts in English: The effects of note-taking strategies. Nordic Journal of English Studies, 12(1), 133–161. |
16- Moradi, S., Ghahari, S., & Abbas Nejad, M. (2020). Learner-vs. expert-constructed outlines: Testing the associations with L2 text comprehension and multiple intelligences. Studies in Second Language Learning and Teaching, 10(2), 359–384. |
17- Nagep, D. S. M. (2022). The impact of five note-taking techniques on academic listening comprehension of EFL student teachers. مجلة القراءة والمعرفة, 22(253), 1–57. |
18- Najar, R. L. (1997). The effects of note taking strategy instruction on comprehension in ESL texts (Unpublished doctoral dissertation). Hawaii University. |
19- Ngoc, N. T. K. (2023). The effect of using mind mapping technique on non-English major students’ grammar achievement at Dong Nai Technology University. Journal of English Language Teaching and Applied Linguistics, 5(2), 128–134. |
20- Piri, A., & Shirkhani, S. (2021). The comparative effects of note-taking and semantic mapping on EFL learners’ vocabulary learning and vocabulary retention. Journal of New Advances in English Language Teaching and Applied Linguistics, 3(1), 506–517. |
21- Siregar, I. R. (2022). The effect of note-taking strategy on listening mastery at the Grade XI Students of SMA Islam Terpadu Darul Hasan Padangsidimpuan (Unpublished doctoral dissertation). State Institute for Islamic Studies Padangsidimpuan. |
22- Ulfani, S. T., Sumardiyani, L., & Affini, L. N. (2023). The Implementation of Cornell Note-Taking to Improving Students' Reading Comprehension. Jurnal Pendidikan dan Sastra Inggris, 3(3), 50–59. |
23- Uysal, N. M., & Tezel, K. V. (2020). The effects of language learning strategies instruction based on learning styles on reading comprehension. RumeliDE Dil ve Edebiyat Araştırmaları Dergisi, (21), 697–714. |
24- Walters, J., & Bozkurt, N. (2009). The effect of keeping vocabulary notebooks on vocabulary acquisition. Language Teaching Research, 13(4), 403–423. |
25- Wilberschied, L. (1998). The relationships among a variation of immediate recall tasks and measures of L1 writing, L2 achievement, and cognitive strategy use in students of high school Spanish (Unpublished doctoral dissertation). The Ohio State University. |
26- Zarei, A. A., & Adami, S. (2013). The effects of semantic mapping, thematic clustering, and notebook keeping on L2 vocabulary recognition and production. Journal on English Language Teaching, 3(2), 17–27. |
27- Zohrabi, M., & Esfandyari, F. (2014). The impact of note taking on the improvement of listening comprehension of Iranian EFL learners. International Journal of English Language and Literature Studies, 3(2), 165–175. |