Statistical Analysis: Modelling the Effect of Communicative Attempt (H1) and Answer Similarity (H2) on Effort

Overview

In this script, we will be modelling the causal relation between effort and correction to confirm or reject our hypotheses.

These are:

H1: In corrections, people enhance some effort-related kinematic and/or acoustic features of their behaviour relative to the baseline.

H2: The enhancement depends on the similarity between the guesser’s answer and the original meaning. A more similar answer will require/result in a smaller enhancement (but still an enhancement) than a less similar answer.

To assess the design and performance of the models, we will use synthetic data that we create to have certain interdependencies and in which the effects have pre-defined values. We use this approach instead of our dyad 0 data because the pilot data do not contain enough observations to form a sensible testing sample.

The confirmatory analysis concerns six outcome variables, namely the integral of torque change, the integral of the amplitude envelope, the integral of change in center of pressure, and the mean peak values of torque change, the amplitude envelope, and change in center of pressure.

For our exploratory analysis, we will model the variables that turn out to be most predictive in relation to communicative attempt, as assessed by XGBoost modelling (a minimal sketch follows below). This is why, in our synthetic data, we create a variable Effort that is continuous but otherwise free of any other assumptions.
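This XGBoost step is exploratory and separate from the confirmatory models below. As a minimal, hypothetical sketch (toy data and feature names invented for illustration; the real run will use the feature dataframe loaded below), feature importance for predicting communicative attempt could be obtained like this:

# Minimal sketch with toy data (all names hypothetical): rank features by how well
# they predict communicative attempt (attempts 1-3 coded as 0-2) with gradient boosting
library(xgboost)

set.seed(1)
toy_X <- matrix(rnorm(300 * 4), ncol = 4,
                dimnames = list(NULL, c("torque_integral", "envelope_integral",
                                        "COPc_integral", "envelope_peak_mean")))
toy_y <- sample(0:2, 300, replace = TRUE)  # hypothetical attempt labels

dtrain <- xgb.DMatrix(data = toy_X, label = toy_y)
fit <- xgb.train(params = list(objective = "multi:softmax", num_class = 3),
                 data = dtrain, nrounds = 50, verbose = 0)

xgb.importance(model = fit)  # gain-based ranking of the toy features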

Code to prepare the environment
library(here)
library(dplyr) # for data-wrangling
library(tidyr)  # for reshaping data (if needed)
library(ggplot2)
library(tibble)
library(rcartocolor)
library(patchwork)

library(ggdag) # for dag
library(dagitty)

library(bayesplot) # bayesian stuff
library(brms)
library(beepr)
library(bayestestR)
library(tidyverse)

library(splines) # for bsplines
library(emmeans) # for lognormal transformations


# current folder (first go to session -> set working directory -> to source file location)
curfolder <- getwd()
parentfolder <- dirname(curfolder)

featurefolder <- paste0(parentfolder, '/07_TS_featureExtraction/Datasets/')
demofolder    <- paste0(parentfolder, '/00_RAWDATA/Demographics_all/')
datasets      <- paste0(curfolder, '/datasets/')
models        <- paste0(curfolder, '/models/')
plots         <- paste0(curfolder, '/plots/')

Loading in our data

This is the feature dataframe we will be using once we have access to the full data. Additionally, we will merge the feature dataframe with demographic data collected via Qualtrics.

data_feat <- read_csv(paste0(featurefolder, "features_df_final.csv"))
data_demo <- read_csv(paste0(demofolder, "all_demodata.csv"))

For the confirmatory analysis, we will need only six features from data_feat - arm torque integral, arm torque peak mean, COPc integral, COPc peak mean, envelope integral, and envelope peak mean. From data_demo, we will need only the BFI scores and Familiarity.

featstokeep <- c('COPc_pospeak_mean', 'envelope_pospeak_mean', 'arm_moment_sum_change_pospeak_mean', 'COPc_integral', 'envelope_integral', 'arm_moment_sum_change_integral', 'correction_info', 'concept', 'TrialID', 'modality', 'expressibility', 'answer_prev', 'answer_prev_dist')

demotokeep <- c('BFI_extra', 'BFI_agree', 'BFI_consc', 'BFI_negemo', 'BFI_open', 'Familiarity', 'pcn_ID')

# Subset the dataframe
data_feat <- data_feat %>% select(all_of(featstokeep))
data_demo <- data_demo %>% select(all_of(demotokeep))

To merge them, we will need the column pcn_ID in data_feat too.

# Extract participant part from TrialID
data_feat <- data_feat %>%
  mutate(
    pcn_ID = str_extract(TrialID, "^[0-9]+") %>%  # Extract first number (before first underscore)
      paste0("_", str_extract(TrialID, "p[0-9]+") %>% str_remove("p")) # Extract last part and format as pcn_ID
  )

final_data <- left_join(data_feat, data_demo, by = "pcn_ID")

head(final_data, n=15)
# A tibble: 15 × 20
   COPc_pospeak_mean envelope_pospeak_mean arm_moment_sum_change…¹ COPc_integral
               <dbl>                 <dbl>                   <dbl>         <dbl>
 1            -0.257                -0.529                  -0.29        -326.  
 2            -0.516                -0.531                  -0.545       -782.  
 3            -0.507                -0.529                   0.41        -791.  
 4            -0.052                -0.406                  -0.227          4.62
 5            -0.155                -0.419                  -0.156       -269.  
 6            -0.233                -0.513                  -0.108       -171.  
 7            -0.561                 0.338                  NA           -486.  
 8             3.50                 NA                       1.34       18163.  
 9            -0.444                -0.408                   1.00       -1092.  
10            -0.287                -0.452                  -0.047       -870.  
11             0.547                 0.137                  NA           1435.  
12            -0.441                -0.51                    0.886       -529.  
13             0.216                 0.474                   0.658       1623.  
14             0.643                 1.31                    0.61        1988.  
15            -0.361                -0.478                   1.56        -290.  
# ℹ abbreviated name: ¹​arm_moment_sum_change_pospeak_mean
# ℹ 16 more variables: envelope_integral <dbl>,
#   arm_moment_sum_change_integral <dbl>, correction_info <chr>, concept <chr>,
#   TrialID <chr>, modality <chr>, expressibility <dbl>, answer_prev <chr>,
#   answer_prev_dist <dbl>, pcn_ID <chr>, BFI_extra <dbl>, BFI_agree <dbl>,
#   BFI_consc <dbl>, BFI_negemo <dbl>, BFI_open <dbl>, Familiarity <dbl>


Because the current pilot data are quite likely insufficient in terms of power (65 observations) and would hinder testing the design of the statistical models, we will create synthetic data with assumed coefficients for all predictors we plan to add to the model.

H1: Stating causal model

Before simulating synthetic data, we will first formulate our causal model to explicitly identify common causes, and hence prevent (or rather minimize) confounding in our statistical model. Note that while we are interested in the causal relationship between effort (outcome) and communicative attempt (predictor for H1) and answer similarity (predictor for H2), we need to acknowledge that there are other factors possibly contributing to the outcome variable, despite the experiment being relatively controlled. These relate, for instance, to the participants (e.g., personality traits) or stimuli (i.e., concepts). Our causal model therefore includes additional variables identified as potentially sharing a causal path with our primary predictor - communicative attempt. While they are not of primary interest, they ensure we can isolate the causal effect of the main predictor on the outcome variable - effort.

To explicitly formulate the causal paths, we will start our analysis by visualizing our causal model using a directed acyclic graph (DAG). We will use the packages dagitty and ggdag to create these DAGs and to extract the implied independencies and the adjustment set. The DAG will help us clarify the causal model and introduce the predictors we aim to model (McElreath 2018). This model goes hand in hand with the logic underlying the synthetic data we create below.

Our first hypothesis goes as follows:

H1: Correction recruits more physical effort than the baseline performance.

The relationship of interest is the causal effect of communicative attempt on effort. Further, our causal model includes the following assumptions:

  • Personality traits (measured with the Big5) will influence effort (e.g., people are more willing to put in more effort if they are open-minded) and also communicative attempt (e.g., more extroverted people are better at this game and therefore need fewer attempts)

  • Familiarity with the guessing partner influences effort (ditto) as well as communicative attempt (e.g., friends are better at this game than strangers)

  • Similarly, the participant will also directly influence effort and CommAtt, because personality traits might not capture all the variation

  • Expressibility of concepts is going to influence effort (e.g., more expressible concepts allow more enhancement - though it could also go the other direction) as well as CommAtt (e.g., more expressible concepts are more readily guessed and don’t need more attempts)

  • Similarly, the concept will also directly influence effort and CommAtt, because expressibility might not capture all the variation

  • Modality (uni vs. multi) will influence the value of effort. We assume that in the unimodal condition, the feature does not need to account for synergic relations with the other modality and may carry the whole signal. In the multimodal condition, however, there may be synergic relations that moderate the full expressiveness of this feature

  • Trial number (characterising how far one is in the experiment) could hint at learning processes throughout the experiment or, in the other direction, at increasing fatigue

Note that we are aware that some of the parameters might be correlated and we might run into issues with collinearity. While the DAG does not help to solve this, it does help in detecting whether two potentially highly correlated variables have separate causal effects or share a common cause. If they share a common cause, there might be a problem in distinguishing their effects from each other. We will report inter-variable correlations when analyzing the actual data.
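For transparency, here is a sketch of how such a DAG could be specified and queried with dagitty and ggdag. The edge list is our reading of the assumptions above and is therefore itself an assumption; in particular, whether Modality and TrialNumber also feed into CommAtt is left open here, and adding those two edges is what would place Mod and TrNum in the adjustment set reported below.

# Sketch of the DAG for H1 (edge list assumed from the assumptions listed above)
dag_h1 <- dagitty::dagitty('dag {
  CommAtt -> Eff
  Big5 -> CommAtt
  Big5 -> Eff
  Fam -> CommAtt
  Fam -> Eff
  Pcn -> Big5
  Pcn -> Fam
  Pcn -> CommAtt
  Pcn -> Eff
  Expr -> CommAtt
  Expr -> Eff
  Conc -> Expr
  Conc -> CommAtt
  Conc -> Eff
  Mod -> Eff
  TrNm -> Eff
}')

# Visualize the DAG
ggdag::ggdag(dag_h1) + ggdag::theme_dag()

# Implied conditional independencies and adjustment set(s) for the CommAtt -> Eff effect
dagitty::impliedConditionalIndependencies(dag_h1)
dagitty::adjustmentSets(dag_h1, exposure = "CommAtt", outcome = "Eff")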

These are the implied conditional independencies

Big5 _||_ Conc
Big5 _||_ Expr
Big5 _||_ Fam | Pcn
Big5 _||_ Mod
Big5 _||_ TrNm
Conc _||_ Fam
Conc _||_ Mod
Conc _||_ Pcn
Conc _||_ TrNm
Expr _||_ Fam
Expr _||_ Mod
Expr _||_ Pcn
Expr _||_ TrNm
Fam _||_ Mod
Fam _||_ TrNm
Mod _||_ Pcn
Mod _||_ TrNm
Pcn _||_ TrNm

This is the adjustment set that needs to be included in the model to make sure we avoid confounding

{ Big5, Conc, Expr, Fam, Mod, Pcn, TrNum }

Creating synthetic data (for pre-registration purposes only)

Now we can proceed with creating synthetic data, incorporating the relationships we have acknowledged in our causal model. We will then use these data to validate the statistical models. We believe this approach constitutes a principled and transparent method for model validation (McElreath 2018; Morris, White, and Crowther 2019). By evaluating the model structure on simulated data before fitting it to empirical observations, we reduce the risk of overfitting and p-hacking, as model assumptions and inferences are tested independently of the observed data.

Note that the synthetic data do not copy the structure of the data and the experimental design perfectly, but they do provide a reliable basis for building the models and testing their performance.

# Set seed for reproducibility
set.seed(0209)

# Set coefficients
b_exp_vocal <- 0.6  # Vocal has lower expressibility
b_exp_multi <- 1.5  # Multimodal has higher expressibility
b_bif <- 1.15  # More extroverted → more effort
b_fam <- 1.10  # More familiarity → more effort
b_exp <- 1.20  # More expressibility → more effort
b_multi <- 0.70  # Multimodal → slightly reduced effort
b_comatt2 <- 1.50  # Effort increases for second attempt
b_comatt3 <- 0.50  # Effort decreases for third attempt
b_prevan <- 0.50  # Higher previous answer similarity → less effort

# Define participants and concepts
n_participants <- 120
n_total_concepts <- 21  # Each participant works with all 21 concepts
n_modalities <- 3  # Gesture, vocal, combined

# Generate participant-level data
participants <- 1:n_participants
Big5 <- runif(n_participants, min = 0, max = 2)
Familiarity <- runif(n_participants, min = 0, max = 2)

# Generate expressibility scores for each concept across modalities
expressibility_matrix <- matrix(runif(n_total_concepts * n_modalities, min = 0, max = 1),
                                nrow = n_total_concepts, ncol = n_modalities)

# Initialize data storage
final_data_synt <- data.frame()

# Define function to simulate participant data
simulate_participant <- function(participant_id) {
  participant_data <- data.frame()
  trial_number <- 1
  
  for (concept_id in 1:n_total_concepts) {
    # Assign a random modality
    modality <- sample(c("gesture", "vocal", "combined"), 1)
    
    # Get expressibility score based on modality
    expressibility_score <- ifelse(modality == "vocal", expressibility_matrix[concept_id, 1] * b_exp_vocal, 
                                   ifelse(modality == "gesture", expressibility_matrix[concept_id, 2], 
                                          expressibility_matrix[concept_id, 3] * b_exp_multi))
    
    # Base effort before adjustments
    base_effort <- b_bif * Big5[participant_id] + 
                   b_fam * Familiarity[participant_id] + 
                   b_exp * expressibility_score + 
                   rnorm(1, mean = 1, sd = 0.5)
    
    # Adjust effort for multimodal condition
    if (modality == "combined") {
      base_effort <- base_effort * b_multi
    }
    
    # Sample the number of communicative attempts (CommAtt)
    adjusted_prob <- c(1 - Familiarity[participant_id],
                       1 - Familiarity[participant_id],
                       1 - Familiarity[participant_id]) * 
                     c(1 - Big5[participant_id],
                       1 - Big5[participant_id],
                       1 - Big5[participant_id]) * 
                     c(1 - expressibility_score,
                       1 - expressibility_score,
                       1 - expressibility_score)
    
    adjusted_prob <- adjusted_prob / sum(adjusted_prob)
    n_attempts <- sample(1:3, 1, prob = adjusted_prob)
    
    prev_answer_similarity <- NA  # First attempt has no previous similarity
    
    # Generate rows for each communicative attempt
    for (attempt in 1:n_attempts) {
      Eff <- base_effort  # Start with base effort
      
      # Modify effort for second and third attempts
      if (attempt == 2) {
        Eff <- Eff * b_comatt2
      } else if (attempt == 3) {
        Eff <- Eff * b_comatt3
      }
      
      # Adjust effort based on previous answer similarity
      if (!is.na(prev_answer_similarity)) {
        Eff <- Eff * (1 + (1 - prev_answer_similarity) * b_prevan)
      }
      
      # Store row
      participant_data <- rbind(participant_data, data.frame(
        Participant = participant_id,
        Concept = concept_id,
        Modality = modality,
        Big5 = Big5[participant_id],
        Familiarity = Familiarity[participant_id],
        Expressibility = expressibility_score,
        CommAtt = attempt,
        Eff = Eff,
        TrialNumber = trial_number,
        PrevAn = prev_answer_similarity
      ))
      
      # Update for next attempt
      trial_number <- trial_number + 1
      prev_answer_similarity <- runif(1, min = 0, max = 1)  # Simulate next similarity
    }
  }
  
  return(participant_data)
}

# Simulate data for all participants
final_data_synt <- do.call(rbind, lapply(participants, simulate_participant))

# Preview results
head(final_data_synt, n=15)
   Participant Concept Modality     Big5 Familiarity Expressibility CommAtt
1            1       1 combined 1.919899    1.843906    0.576172309       1
2            1       1 combined 1.919899    1.843906    0.576172309       2
3            1       1 combined 1.919899    1.843906    0.576172309       3
4            1       2  gesture 1.919899    1.843906    0.349378655       1
5            1       2  gesture 1.919899    1.843906    0.349378655       2
6            1       2  gesture 1.919899    1.843906    0.349378655       3
7            1       3    vocal 1.919899    1.843906    0.339527537       1
8            1       3    vocal 1.919899    1.843906    0.339527537       2
9            1       3    vocal 1.919899    1.843906    0.339527537       3
10           1       4 combined 1.919899    1.843906    0.718778601       1
11           1       4 combined 1.919899    1.843906    0.718778601       2
12           1       5  gesture 1.919899    1.843906    0.078069667       1
13           1       6  gesture 1.919899    1.843906    0.004189679       1
14           1       6  gesture 1.919899    1.843906    0.004189679       2
15           1       6  gesture 1.919899    1.843906    0.004189679       3
         Eff TrialNumber     PrevAn
1   4.343849           1         NA
2   6.566250           2 0.98450637
3   2.367010           3 0.82035722
4   5.497853           4         NA
5  10.367605           5 0.48565959
6   3.998448           6 0.09090184
7   6.153093           7         NA
8  12.934986           8 0.19707656
9   3.593379           9 0.66401787
10  4.094773          10         NA
11  6.406363          11 0.91397050
12  5.275766          12         NA
13  5.605587          13         NA
14 12.021459          14 0.14060042
15  3.439740          15 0.54549131

Exploring the synthetic data

This is the distribution of our simulated effort feature
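The plot is produced in a hidden chunk; a minimal ggplot2 sketch of such a plot (assuming the final_data_synt dataframe created above) could look like this:

# Sketch: distribution of the simulated effort variable
ggplot(final_data_synt, aes(x = Eff)) +
  geom_histogram(aes(y = after_stat(density)), bins = 40,
                 fill = "grey85", colour = "grey30") +
  geom_density(colour = "steelblue") +
  labs(x = "Simulated effort", y = "Density")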

This is the relationship between effort and communicative attempt (H1) as seen in the raw (synthetic) data
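Again, the figure itself is produced elsewhere; a minimal sketch of such a plot (same assumption about final_data_synt) could be:

# Sketch: raw relationship between simulated effort and communicative attempt
ggplot(final_data_synt, aes(x = factor(CommAtt), y = Eff)) +
  geom_violin(fill = "grey90") +
  geom_boxplot(width = 0.15, outlier.alpha = 0.3) +
  labs(x = "Communicative attempt", y = "Simulated effort")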

H1: Modelling

In this part, we are going to test the following hypothesis:

H1: Correction recruits more physical effort than the baseline performance.

We will fit Bayesian mixed-effects models using the brms package (Bürkner 2017).

Data wrangling

To be able to interpret the data, we will first need to do some data wrangling

Convert columns to factors

final_data <- final_data_synt  # from here on, we fit the models to the synthetic data

final_data$CommAtt <- as.factor(final_data$CommAtt)
final_data$Modality <- as.factor(final_data$Modality)
final_data$Participant <- as.factor(final_data$Participant)
final_data$Concept <- as.factor(final_data$Concept)

final_data$TrialNumber <- as.numeric(final_data$TrialNumber)  # Ensure TrialNumber is numeric

Contrast-coding of categorical variables

contrasts(final_data$CommAtt) <- MASS::contr.sdif(3)
contrasts(final_data$Modality) <- contr.sum(3)/2
Note

This is how CommAtt is contrast-coded now

         2-1        3-2
1 -0.6666667 -0.3333333
2  0.3333333 -0.3333333
3  0.3333333  0.6666667

This is how Modality is contrast-coded

         [,1] [,2]
combined  0.5  0.0
gesture   0.0  0.5
vocal    -0.5 -0.5

Centering continuous variables

final_data$TrialNumber_c <- final_data$TrialNumber - median(range(final_data$TrialNumber))
final_data$Familiarity <- final_data$Familiarity - median(range(final_data$Familiarity))
final_data$Big5 <- final_data$Big5 - median(range(final_data$Big5))
# For now, we will just center Familiarity and Big5 because we created them synthetically. But the real data have these two variables already z-scored

Z-scoring expressibility within a modality

final_data <-
  final_data |>
  group_by(Modality) |>
  mutate(Expressibility_z = (Expressibility - mean(Expressibility)) / sd(Expressibility, na.rm = TRUE)) |>
  ungroup()

Model 1 - Simple reproduction of DAG

First we want to fit a simple model that reproduces our DAG.

Our main predictor is communicative attempt (CommAtt). To account for confounders - variables affecting both the predictor and the dependent variable - we need to adjust for them in the model by including them as covariates to isolate the effect of the predictor. Based on our DAG, confounders include:

  1. familiarity
  2. big5
  3. expressibility
  4. trial number
  5. modality
  6. concept
  7. participant

We include 1.-5. as fixed factors. For 6.-7., we include varying intercepts, as we expect that each participant and concept may have their own baseline level of effort, thus allowing for individual variation. Partial pooling is also beneficial in that extreme values (or categories with fewer datapoints) will be shrunk toward the overall average.

We will not yet include the PrevAn (previous answer) variable, because it will require some further data wrangling when we build the model for H2. That is mainly because PrevAn has NA values, specifically for first attempts. Models would automatically exclude rows with NAs, and we would therefore lose all effort data for attempt 1. For H1, however, we want to keep those rows.

This is our first model without setting any priors, leaving them at their default values

h1.m1 <- brm(Eff ~ 1 + CommAtt + Familiarity + Big5 + Expressibility_z + TrialNumber_c + Modality + (1 | Participant) + (1 | Concept),
                data = final_data,
                iter = 4000,
                cores = 4)

# Add criterions for later diagnostics
h1.m1 <- add_criterion(h1.m1, criterion = c("loo", "waic"))

# Calculate also variance explained (R^2)
h1.m1_R2 <- bayes_R2(h1.m1)

# Save both as objects
saveRDS(h1.m1, here("09_Analysis_Modeling", "models", "h1.m1.rds"))
saveRDS(h1.m1_R2, here("09_Analysis_Modeling", "models", "h1.m1_R2.rds"))

beep(5)
h1.m1 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m1.rds"))
h1.m1_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m1_R2.rds"))


# Summary
summary(h1.m1)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: Eff ~ 1 + CommAtt + Familiarity + Big5 + Expressibility_z + TrialNumber_c + Modality + (1 | Participant) + (1 | Concept) 
   Data: final_data (Number of observations: 5076) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~Concept (Number of levels: 21) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.04      0.02     0.00     0.09 1.00     2122     1863

~Participant (Number of levels: 120) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.03      0.02     0.00     0.07 1.00     2320     3475

Regression Coefficients:
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept            4.02      0.02     3.98     4.06 1.00     6694     5622
CommAtt2M1           3.08      0.03     3.03     3.14 1.00     8963     6147
CommAtt3M2          -4.39      0.04    -4.46    -4.32 1.00     8289     5880
Familiarity          1.26      0.02     1.21     1.30 1.00     8380     5226
Big5                 1.29      0.02     1.25     1.34 1.00     9979     6018
Expressibility_z     0.46      0.02     0.43     0.49 1.00     9888     5989
TrialNumber_c       -0.00      0.00    -0.00     0.00 1.00     7595     5984
Modality1           -1.30      0.04    -1.38    -1.23 1.00     8559     5997
Modality2            0.64      0.04     0.57     0.71 1.00     7347     5787

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.91      0.01     0.89     0.93 1.00    11791     5667

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Note

Note on modality: with the scaled sum contrasts defined above, Modality1 captures the difference between combined and the average of the other two modalities, and Modality2 captures the difference between gesture and the average of the other two (combined and vocal).

Transforming coefficients

To be able to link these estimates back to the simulation coefficients, let’s create a function

transform_attempt <- function(intercept, CommAtt2M1, CommAtt3M2) {
  # Effort for the first attempt (base effort)
  Eff_attempt_1 <- intercept
  
  # Effort for the second attempt
  Eff_attempt_2 <- Eff_attempt_1 + CommAtt2M1
  
  # Effort for the third attempt
  Eff_attempt_3 <- Eff_attempt_1 + CommAtt2M1 + CommAtt3M2
  
  # Calculate ratios
  b_attempt2 <- Eff_attempt_2 / Eff_attempt_1
  b_attempt3 <- Eff_attempt_3 / Eff_attempt_2
  
  return(data.frame(b_attempt2, b_attempt3))
}

h1m1_coeff <- transform_attempt(4.02, 3.08, -4.39)
print(h1m1_coeff)
  b_attempt2 b_attempt3
1   1.766169  0.3816901
# For centered familiarity
fam = (4.02 + 1.26) / 4.02
print(fam)
[1] 1.313433
# For centered BIF
bif = (4.02 + 1.29) / 4.02
print(bif)
[1] 1.320896
Note

Expressibility is z-scored, so we will not recover the coefficient in the same way, but we can check the conditional effects to see whether it looks reasonable

Let’s also check the visuals

plot(h1.m1)

# all caterpillars look nice

plot(conditional_effects(h1.m1), points = TRUE)

# the effects all go in good direction

pp_check(h1.m1, type = "dens_overlay")

# Looks good but not amazing - mostly because the posteriors seem to not know effort cannot be negative

pp_check(h1.m1, type = "error_scatter_avg")

# half blobby, half correlated, so still some room for improvement
# positive correlation means that errors increase with predicted values. So the model does perform well for some range, but becomes less reliable with increase in the predicted values
# also the blob is centered around 0 which is good
# it could be we are ignoring some interaction terms or non-linearity (which we know we kind of do). Transformation could also help (e.g., log). Of course, we are also still not specifying any priors so let's not yet make it a disaster

h1.m1_R2
    Estimate   Est.Error      Q2.5     Q97.5
R2 0.8381496 0.001662319 0.8348364 0.8412852
# explained variance around 83%

Overall, we see that all predictors go in the expected direction, mostly also in accordance with the expected coefficients. Of course, the synthetic data are quite complex, so there might be other dependencies that moderate the causal relationships, which is why we do not see exactly the numbers we used to create the data.

Let’s have another model for comparison.

We can assume that participants and concepts have not only different baselines of effort (varying intercepts); the effect of CommAtt on effort might vary across them too. Hence we can try to add varying slopes for them and see whether the diagnostics improve. We will also add TrialNumber as a varying intercept, because we expect variation between earlier and later performances (because of learning or, conversely, fatigue).

Model 2 - Varying slopes and intercepts

h1.m2 <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z +  (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c), 
                data = final_data,
                iter = 4000,
                cores = 4)

# Add criterions for later diagnostics
h1.m2 <- add_criterion(h1.m2, criterion = c("loo", "waic"))

# Calculate also variance explained (R^2)
h1.m2_R2 <- bayes_R2(h1.m2)

# Save both as objects
saveRDS(h1.m2, here("09_Analysis_Modeling", "models", "h1.m2.rds"))
saveRDS(h1.m2_R2, here("09_Analysis_Modeling", "models", "h1.m2_R2.rds"))

beep(5)
h1.m2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m2.rds"))
h1.m2_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m2_R2.rds"))

# Summary
summary(h1.m2)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c) 
   Data: final_data (Number of observations: 5076) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~Concept (Number of levels: 21) 
                           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
sd(Intercept)                  0.06      0.02     0.02     0.10 1.00     2791
sd(CommAtt2M1)                 0.17      0.04     0.11     0.26 1.00     3239
sd(CommAtt3M2)                 0.25      0.06     0.16     0.37 1.00     3386
cor(Intercept,CommAtt2M1)      0.52      0.28    -0.14     0.93 1.00     1312
cor(Intercept,CommAtt3M2)     -0.35      0.32    -0.90     0.32 1.00     1261
cor(CommAtt2M1,CommAtt3M2)    -0.93      0.07    -1.00    -0.75 1.00     5691
                           Tail_ESS
sd(Intercept)                  2048
sd(CommAtt2M1)                 4832
sd(CommAtt3M2)                 5133
cor(Intercept,CommAtt2M1)      1617
cor(Intercept,CommAtt3M2)      1450
cor(CommAtt2M1,CommAtt3M2)     6716

~Participant (Number of levels: 120) 
                           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
sd(Intercept)                  0.15      0.04     0.08     0.22 1.00      901
sd(CommAtt2M1)                 0.78      0.06     0.68     0.90 1.00     2066
sd(CommAtt3M2)                 1.10      0.08     0.95     1.27 1.00     2076
cor(Intercept,CommAtt2M1)      0.96      0.03     0.87     1.00 1.01      352
cor(Intercept,CommAtt3M2)     -0.96      0.04    -1.00    -0.85 1.01      353
cor(CommAtt2M1,CommAtt3M2)    -1.00      0.00    -1.00    -0.99 1.00     7139
                           Tail_ESS
sd(Intercept)                  2511
sd(CommAtt2M1)                 3409
sd(CommAtt3M2)                 3623
cor(Intercept,CommAtt2M1)       596
cor(Intercept,CommAtt3M2)       690
cor(CommAtt2M1,CommAtt3M2)     6841

~TrialNumber_c (Number of levels: 50) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.02      0.02     0.00     0.06 1.00     3119     3942

Regression Coefficients:
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept            4.03      0.02     3.99     4.08 1.00     2093     3494
CommAtt2M1           3.07      0.08     2.90     3.23 1.00     1384     2570
CommAtt3M2          -4.36      0.12    -4.59    -4.13 1.00     1411     2477
Modality1           -1.29      0.03    -1.35    -1.23 1.00     8986     6966
Modality2            0.64      0.03     0.58     0.71 1.00     9767     6458
Big5                 1.08      0.05     0.99     1.16 1.01      854     2264
Familiarity          1.04      0.05     0.94     1.12 1.00      957     2173
Expressibility_z     0.44      0.02     0.41     0.47 1.00     5445     5281

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.80      0.01     0.78     0.81 1.00    11536     6199

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
# Coefficients remain mostly unchanged but there were some divergent transitions
plot(h1.m2)

# some of the caterpillars are not that pretty anymore

plot(conditional_effects(h1.m2), points = TRUE)

# the effects all go in good direction

pp_check(h1.m2, type = "dens_overlay")

# Looks good but not amazing - mostly because the posteriors seem to not know effort cannot be negative

pp_check(h1.m2, type = "error_scatter_avg")

# This looks a bit more blobby than before but still lots of errors for higher values

h1.m2_R2
    Estimate  Est.Error      Q2.5     Q97.5
R2 0.8760573 0.00125936 0.8734939 0.8784314
# explained variance around 87%

One of the reasons why we might be getting the divergent transitions is the correlation between slopes and intercepts in the previous model. Let’s now get rid of it

Model 3 - No correlation coefficient

h1.m3 <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z +  (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c), 
                data = final_data,
                iter = 4000,
                cores = 4)
                
# Add criterions for later diagnostics
h1.m3 <- add_criterion(h1.m3, criterion = c("loo", "waic"))

# Calculate also variance explained (R^2)
h1.m3_R2 <- bayes_R2(h1.m3)

# Save both as objects
saveRDS(h1.m3, here("09_Analysis_Modeling", "models", "h1.m3.rds"))
saveRDS(h1.m3_R2, here("09_Analysis_Modeling", "models", "h1.m3_R2.rds"))

beep(5)
h1.m3 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m3.rds"))
h1.m3_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m3_R2.rds"))

# Summary
summary(h1.m3)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c) 
   Data: final_data (Number of observations: 5076) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~Concept (Number of levels: 21) 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)      0.05      0.02     0.01     0.10 1.00     2648     2360
sd(CommAtt2M1)     0.09      0.05     0.01     0.19 1.01     1159     1540
sd(CommAtt3M2)     0.17      0.06     0.04     0.30 1.00     1422     1517

~Participant (Number of levels: 120) 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)      0.03      0.02     0.00     0.08 1.00     2442     4123
sd(CommAtt2M1)     0.71      0.05     0.61     0.82 1.00     2733     4516
sd(CommAtt3M2)     1.05      0.08     0.91     1.21 1.00     2231     3994

~TrialNumber_c (Number of levels: 50) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.02      0.02     0.00     0.06 1.00     3301     3712

Regression Coefficients:
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept            4.03      0.02     3.99     4.06 1.00     9303     6039
CommAtt2M1           3.07      0.07     2.93     3.22 1.00     2152     3434
CommAtt3M2          -4.36      0.11    -4.58    -4.15 1.00     2100     3982
Modality1           -1.30      0.03    -1.36    -1.24 1.00    11936     6326
Modality2            0.65      0.03     0.58     0.71 1.00    12551     6401
Big5                 1.23      0.02     1.19     1.28 1.00    15911     6657
Familiarity          1.20      0.02     1.15     1.24 1.00    14065     6574
Expressibility_z     0.45      0.01     0.42     0.48 1.00    14842     5619

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.81      0.01     0.80     0.83 1.00    11557     5519

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
# Coefficients remain mostly unchanged 
plot(h1.m3)

# removing the correlations fixed the sampling issues

plot(conditional_effects(h1.m3), points = TRUE)

# the effects all go in good direction

pp_check(h1.m3, type = "dens_overlay")

# the problem with predicting negative values remains

pp_check(h1.m3, type = "error_scatter_avg")

# unchanged

h1.m3_R2
    Estimate   Est.Error      Q2.5    Q97.5
R2 0.8714492 0.001445024 0.8685275 0.874215
# explained variance remains around 87%

Model 3.1 - adding priors

Now we will add some minimal priors, giving the model reasonably informative, yet weak, guidance.

priors_h1m3p <- c(
  set_prior("normal(2.5, 0.5)", class = "Intercept", lb=0),
  set_prior("normal(0,0.50)", class = "b", coef = "CommAtt2M1"),
  set_prior("normal(0,0.50)", class = "b", coef = "CommAtt3M2"),
  set_prior("normal(0,0.25)", class = "b", coef = "Modality1"),
  set_prior("normal(0,0.25)", class = "b", coef = "Modality2"),
  set_prior("normal(0,0.25)", class = "b", coef = "Big5"),
  set_prior("normal(0,0.25)", class = "b", coef = "Familiarity"),
  set_prior("normal(0,0.25)", class = "b", coef = "Expressibility_z"),
  
  set_prior("normal(0.5,0.1)", class = "sd", group = "TrialNumber_c"),
  set_prior("normal(0.5,0.1)", class = "sd", group = "Participant"),
  set_prior("normal(0.5,0.1)", class = "sd", group = "Concept"),
  set_prior("normal(1,0.1)", class = "sd"),
  
  set_prior("normal(0.5,0.25)", class = "sigma")
  
)

h1.m3p <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z +  (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c), 
                data = final_data,
                prior=priors_h1m3p,
                family = gaussian,
                iter = 4000,
                cores = 4)


# Add criterions for later diagnostics
h1.m3p <- add_criterion(h1.m3p, criterion = c("loo", "waic"))

# Calculate also variance explained (R^2)
h1.m3p_R2 <- bayes_R2(h1.m3p)

# Save both as objects
saveRDS(h1.m3p, here("09_Analysis_Modeling", "models", "h1.m3p.rds"))
saveRDS(h1.m3p_R2, here("09_Analysis_Modeling", "models", "h1.m3p_R2.rds"))

beep(5)
h1.m3p <- readRDS(here("09_Analysis_Modeling", "models", "h1.m3p.rds"))
h1.m3p_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m3p_R2.rds"))


# Summary
summary(h1.m3p)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c) 
   Data: final_data (Number of observations: 5076) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~Concept (Number of levels: 21) 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)      0.08      0.03     0.03     0.14 1.00     2732     4188
sd(CommAtt2M1)     0.22      0.06     0.12     0.36 1.00     2853     4652
sd(CommAtt3M2)     0.34      0.08     0.20     0.51 1.00     2590     5063

~Participant (Number of levels: 120) 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)      0.06      0.02     0.01     0.10 1.00     2220     2200
sd(CommAtt2M1)     0.66      0.04     0.58     0.75 1.00     4131     5851
sd(CommAtt3M2)     0.89      0.05     0.80     0.98 1.00     4638     6332

~TrialNumber_c (Number of levels: 50) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.04      0.02     0.00     0.08 1.00     2687     3691

Regression Coefficients:
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept            4.02      0.02     3.97     4.07 1.00     8469     6434
CommAtt2M1           2.99      0.08     2.83     3.15 1.00     2795     4372
CommAtt3M2          -4.12      0.12    -4.35    -3.87 1.00     2401     3679
Modality1           -1.27      0.03    -1.34    -1.21 1.00    13767     6673
Modality2            0.63      0.03     0.56     0.69 1.00    13131     6444
Big5                 1.23      0.02     1.18     1.28 1.00    12639     6019
Familiarity          1.19      0.03     1.14     1.24 1.00    13068     6412
Expressibility_z     0.45      0.02     0.42     0.48 1.00    15210     5985

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.81      0.01     0.80     0.83 1.00    14125     5732

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
# Coefficients remain mostly unchanged 
plot(h1.m3p)

# all good

plot(conditional_effects(h1.m3p), points = TRUE)

# the effects all go in good direction

pp_check(h1.m3p, type = "dens_overlay")

# the problem with predicting negative values remains

pp_check(h1.m3p, type = "error_scatter_avg")

# unchanged

h1.m3p_R2
    Estimate   Est.Error      Q2.5    Q97.5
R2 0.8705705 0.001478901 0.8676255 0.873398
# explained variance remains around 87%

Model 4 - Restricting priors with exponential distribution

Let’s do one more test with the priors, restricting the sd priors with an exponential distribution, i.e., so that negative values are not possible

priors_h1m4 <- c(
  set_prior("normal(3, 0.3)", class = "Intercept", lb = 0),
  set_prior("normal(0, 0.25)", class = "b", coef = "CommAtt2M1"),
  set_prior("normal(0, 0.25)", class = "b", coef = "CommAtt3M2"),
  set_prior("normal(0, 0.15)", class = "b", coef = "Modality1"),
  set_prior("normal(0, 0.15)", class = "b", coef = "Modality2"),
  set_prior("normal(0, 0.15)", class = "b", coef = "Big5"),
  set_prior("normal(0, 0.15)", class = "b", coef = "Familiarity"),
  set_prior("normal(0, 0.15)", class = "b", coef = "Expressibility_z"),

  # Exponential priors for standard deviations
  set_prior("exponential(3)", class = "sd", group = "TrialNumber_c"), # exp(3) has a mean of 1/3 and concentrates most density around small values
  set_prior("exponential(3)", class = "sd", group = "Participant"),
  set_prior("exponential(3)", class = "sd", group = "Concept"),
  set_prior("exponential(1)", class = "sd"),  # Generic sd prior

  # Residual standard deviation - keep it narrow
  set_prior("normal(0.5, 0.1)", class = "sigma")
)

h1.m4 <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z +  (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c), 
                data = final_data,
                prior=priors_h1m4,
                family = gaussian,
                iter = 4000,
                cores = 4)


# Add criterions for later diagnostics
h1.m4 <- add_criterion(h1.m4, criterion = c("loo", "waic"))

# Calculate also variance explained (R^2)
h1.m4_R2 <- bayes_R2(h1.m4)

# Save both as objects
saveRDS(h1.m4, here("09_Analysis_Modeling", "models", "h1.m4.rds"))
saveRDS(h1.m4_R2, here("09_Analysis_Modeling", "models", "h1.m4_R2.rds"))

beep(5)
h1.m4 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m4.rds"))
h1.m4_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m4_R2.rds"))


# Summary
summary(h1.m4)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c) 
   Data: final_data (Number of observations: 5076) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~Concept (Number of levels: 21) 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)      0.05      0.02     0.01     0.09 1.00     2376     2462
sd(CommAtt2M1)     1.98      0.37     1.33     2.78 1.00     1362     2663
sd(CommAtt3M2)     3.36      0.46     2.56     4.38 1.00      942     2031

~Participant (Number of levels: 120) 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)      0.03      0.02     0.00     0.08 1.00     2217     4211
sd(CommAtt2M1)     0.71      0.05     0.61     0.82 1.00     2746     4486
sd(CommAtt3M2)     1.05      0.08     0.92     1.22 1.00     2446     4360

~TrialNumber_c (Number of levels: 50) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.02      0.02     0.00     0.06 1.00     3383     3814

Regression Coefficients:
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept            4.02      0.02     3.99     4.06 1.00     9553     6582
CommAtt2M1           0.88      0.30     0.31     1.47 1.00     4070     5104
CommAtt3M2          -0.36      0.26    -0.87     0.14 1.00    10008     6008
Modality1           -1.23      0.03    -1.29    -1.16 1.00    14208     6281
Modality2            0.59      0.03     0.53     0.65 1.00    15010     6334
Big5                 1.21      0.02     1.16     1.25 1.00    13972     6575
Familiarity          1.17      0.02     1.12     1.21 1.00    13947     5839
Expressibility_z     0.44      0.01     0.41     0.47 1.00    17170     6516

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.81      0.01     0.79     0.83 1.00    15722     5841

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
# the coefficients for commatt now seem much lower
h1m1_coeff <- transform_attempt(4.02, 0.88, -0.36)
print(h1m1_coeff)
  b_attempt2 b_attempt3
1   1.218905  0.9265306
# For centered familiarity
fam = (4.02 + 1.17) / 4.02
print(fam)
[1] 1.291045
# For centered BIF
bif = (4.02 + 1.21) / 4.02
print(bif)
[1] 1.300995

So it seems that these priors are misleading the model, and this will not be the way to go

plot(h1.m4)

# all good

plot(conditional_effects(h1.m4), points = TRUE)

# the effect of the main predictor is now strongly attenuated

pp_check(h1.m4, type = "dens_overlay")

# the problem with predicting negative values remains

pp_check(h1.m4, type = "error_scatter_avg")

# unchanged

h1.m4_R2
    Estimate   Est.Error     Q2.5    Q97.5
R2 0.8701813 0.001472199 0.867155 0.872996
# explained variance remains around 87%

Model 5 - Student family

We still see negative values in the posterior simulations, so let’s try Student’s t-distribution, which is more robust to outliers and can potentially reduce the likelihood of negative values (if we reduce the degrees of freedom)

priors_h1m5 <- c(
  set_prior("normal(2.5, 0.5)", class = "Intercept", lb=0),
  set_prior("normal(0,0.50)", class = "b", coef = "CommAtt2M1"),
  set_prior("normal(0,0.50)", class = "b", coef = "CommAtt3M2"),
  set_prior("normal(0,0.25)", class = "b", coef = "Modality1"),
  set_prior("normal(0,0.25)", class = "b", coef = "Modality2"),
  set_prior("normal(0,0.25)", class = "b", coef = "Big5"),
  set_prior("normal(0,0.25)", class = "b", coef = "Familiarity"),
  set_prior("normal(0,0.25)", class = "b", coef = "Expressibility_z"),
  
  set_prior("normal(0.5,0.05)", class = "sd", group = "TrialNumber_c"),
  set_prior("normal(0.5,0.05)", class = "sd", group = "Participant"),
  set_prior("normal(0.5,0.05)", class = "sd", group = "Concept"),
  set_prior("normal(1,0.05)", class = "sd"),
  
  set_prior("normal(0.5,0.1)", class = "sigma"),
  set_prior("gamma(2, 0.1)", class = "nu")  # Prior for degrees of freedom
  
)

h1.m5 <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z +  (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c), 
                data = final_data,
                prior=priors_h1m5,
                family = student,
                iter = 4000,
                cores = 4)


# Add criterions for later diagnostics
h1.m5 <- add_criterion(h1.m5, criterion = c("loo", "waic"))

# Calculate also variance explained (R^2)
h1.m5_R2 <- bayes_R2(h1.m5)

# Save both as objects
saveRDS(h1.m5, here("09_Analysis_Modeling", "models", "h1.m5.rds"))
saveRDS(h1.m5_R2, here("09_Analysis_Modeling", "models", "h1.m5_R2.rds"))

beep(5)
h1.m5 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m5.rds"))
h1.m5_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m5_R2.rds"))


# Summary
summary(h1.m5)
 Family: student 
  Links: mu = identity; sigma = identity; nu = identity 
Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c) 
   Data: final_data (Number of observations: 5076) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~Concept (Number of levels: 21) 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)      0.38      0.06     0.26     0.49 1.00     5974     5242
sd(CommAtt2M1)     0.42      0.05     0.32     0.53 1.00     8447     6363
sd(CommAtt3M2)     0.46      0.05     0.36     0.56 1.00     8880     6408

~Participant (Number of levels: 120) 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)      0.13      0.02     0.10     0.17 1.00     3617     5313
sd(CommAtt2M1)     0.57      0.03     0.51     0.63 1.00     6105     6010
sd(CommAtt3M2)     0.72      0.03     0.66     0.79 1.00     7296     6471

~TrialNumber_c (Number of levels: 50) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.09      0.03     0.04     0.15 1.00     1649     2746

Regression Coefficients:
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept            3.92      0.08     3.75     4.08 1.00     1736     3195
CommAtt2M1           2.78      0.11     2.56     2.99 1.00     2927     3842
CommAtt3M2          -3.95      0.13    -4.19    -3.70 1.00     2998     3937
Modality1           -1.11      0.03    -1.17    -1.06 1.00    13750     6634
Modality2            0.56      0.03     0.51     0.62 1.00    15623     6248
Big5                 1.15      0.03     1.10     1.21 1.00    10082     6655
Familiarity          1.13      0.03     1.07     1.18 1.00     9422     6474
Expressibility_z     0.39      0.01     0.36     0.41 1.00    16992     6093

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.52      0.01     0.50     0.55 1.00    11160     6372
nu        2.96      0.16     2.67     3.28 1.00    13245     5813

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
# the coefficients look ok again. Familiarity and Big5 even got closer to our simulated betas
plot(h1.m5)

# all good

plot(conditional_effects(h1.m5), points = TRUE)

# the effects all go in good direction

pp_check(h1.m5, type = "dens_overlay")

# the fit seems somewhat better than with gaussian, but still negative values are there

pp_check(h1.m5, type = "error_scatter_avg")

# unchanged

h1.m5_R2
    Estimate   Est.Error    Q2.5     Q97.5
R2 0.8541097 0.002443239 0.84921 0.8587499
# explained variance remains around 85%

Model 6 - log-normal distribution

We have already seen - when plotting - that effort probably tends towards a lognormal distribution. We will keep the model structure constant and just exchange the distribution family. Priors are kept at their defaults

h1.m6 <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z +  (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c), 
                data = final_data,
                family = lognormal(),
                iter = 4000,
                cores = 4)


# Add criterions for later diagnostics
h1.m6 <- add_criterion(h1.m6, criterion = c("loo", "waic"))

# Calculate also variance explained (R^2)
h1.m6_R2 <- bayes_R2(h1.m6)

# Save both as objects
saveRDS(h1.m6, here("09_Analysis_Modeling", "models", "h1.m6.rds"))
saveRDS(h1.m6_R2, here("09_Analysis_Modeling", "models", "h1.m6_R2.rds"))

beep(5)
h1.m6 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m6.rds"))
h1.m6_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m6_R2.rds"))


# Summary
summary(h1.m6)
 Family: lognormal 
  Links: mu = identity; sigma = identity 
Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c) 
   Data: final_data (Number of observations: 5076) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~Concept (Number of levels: 21) 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)      0.01      0.00     0.00     0.02 1.00     2298     2414
sd(CommAtt2M1)     0.00      0.00     0.00     0.01 1.00     4803     3682
sd(CommAtt3M2)     0.01      0.01     0.00     0.02 1.00     3687     3650

~Participant (Number of levels: 120) 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)      0.05      0.00     0.04     0.06 1.00     2844     4508
sd(CommAtt2M1)     0.00      0.00     0.00     0.01 1.00     4417     3655
sd(CommAtt3M2)     0.01      0.01     0.00     0.03 1.00     2618     2888

~TrialNumber_c (Number of levels: 50) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.00      0.00     0.00     0.01 1.00     3411     3578

Regression Coefficients:
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept            1.23      0.01     1.22     1.24 1.00     3120     4439
CommAtt2M1           0.62      0.01     0.61     0.63 1.00     9634     6358
CommAtt3M2          -1.10      0.01    -1.11    -1.08 1.00     9514     5588
Modality1           -0.31      0.01    -0.33    -0.30 1.00     9097     6088
Modality2            0.16      0.01     0.15     0.17 1.00     9223     6512
Big5                 0.32      0.01     0.31     0.34 1.00     2924     4558
Familiarity          0.32      0.01     0.31     0.34 1.00     2297     3545
Expressibility_z     0.12      0.00     0.12     0.13 1.00    10923     6710

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.17      0.00     0.17     0.18 1.00    11348     5998

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
# Coefficients remain mostly unchanged 

Converting to original scale

Of course, everything is now on the log scale, so it is not as straightforwardly interpretable. This is how we can get to the effort values under a certain value of a predictor

# Compute estimated marginal means on log-scale
em_h1.m6 <- emmeans(h1.m6, ~ CommAtt) #~ CommAtt

#Backtransform the post.beta values
em_h1.m6@post.beta <- exp(em_h1.m6@post.beta)
print(em_h1.m6)
 CommAtt emmean lower.HPD upper.HPD
 1         3.33      3.29      3.37
 2         6.21      6.13      6.29
 3         2.08      2.05      2.11

Results are averaged over the levels of: Modality 
Point estimate displayed: median 
HPD interval probability: 0.95 

So we indeed see that effort increases for CommAtt2 but then decreases again for CommAtt3. We can also try to recover our simulated betas

coeff1 <- 3.33*1.5 
coeff2 <- 3.33*0.5

print(coeff1)
[1] 4.995
print(coeff2) # which is very close to the mean values we see from the model
[1] 1.665

We see that the effect estimated by the model is now stronger than in our simulated data. However, note that these values are averaged over the predictor Modality

We can also try a different method to get the coefficients (code adapted from here)

# Extract posterior samples
alpha_samples <- as_draws_df(h1.m6)$b_Intercept
beta_2_vs_1 <- as_draws_df(h1.m6)$b_CommAtt2M1
beta_3_vs_2 <- as_draws_df(h1.m6)$b_CommAtt3M2

# Compute expected values on the log scale
mu_1 <- alpha_samples  # CommAtt 1
mu_2 <- alpha_samples + beta_2_vs_1  # CommAtt 2
mu_3 <- alpha_samples + beta_2_vs_1 + beta_3_vs_2  # CommAtt 3

# Transform to original scale
effect_1 <- exp(mu_1)
effect_2 <- exp(mu_2)
effect_3 <- exp(mu_3)

# Calculate contrasts on the original scale
effect_diff_2_vs_1 <- effect_2 - effect_1
effect_diff_3_vs_2 <- effect_3 - effect_2
effect_diff_3_vs_1 <- effect_3 - effect_1

# Summarize the effects
list(
  mean_intercept = mean(effect_1),
  mean_diff_2_vs_1 = c(mean = mean(effect_diff_2_vs_1), quantile(effect_diff_2_vs_1, c(0.025, 0.975))),
  mean_diff_3_vs_2 = c(mean = mean(effect_diff_3_vs_2), quantile(effect_diff_3_vs_2, c(0.025, 0.975))),
  mean_diff_3_vs_1 = c(mean = mean(effect_diff_3_vs_1), quantile(effect_diff_3_vs_1, c(0.025, 0.975)))
)
$mean_intercept
[1] 3.419549

$mean_diff_2_vs_1
    mean     2.5%    97.5% 
2.950688 2.872812 3.029410 

$mean_diff_3_vs_2
     mean      2.5%     97.5% 
-4.240333 -4.321665 -4.160104 

$mean_diff_3_vs_1
     mean      2.5%     97.5% 
-1.289646 -1.320918 -1.257698 

Now we can use these coefficients to transform to our simulated betas

h1m6_coeff <- transform_attempt(3.419549, 2.950688, -4.240333)
print(h1m6_coeff)
  b_attempt2 b_attempt3
1   1.862888  0.3343524

Now this looks closer again to our betas.

This code does the same for all predictors (except concept and participant)

# Extract posterior samples
posterior_samples <- as_draws_df(h1.m6)
alpha_samples <- posterior_samples$b_Intercept

# Create a list to store effects for each fixed factor
effect_list <- list()

# Helper function to calculate summary statistics
get_effect_summary <- function(effect_samples) {
  mean_effect <- mean(effect_samples)
  se_effect <- sd(effect_samples)
  ci_effect <- quantile(effect_samples, c(0.025, 0.975))
  post_prob <- mean(effect_samples > 0)
  c(mean = mean_effect, 
    se = se_effect, 
    lower_ci = ci_effect[1], 
    upper_ci = ci_effect[2], 
    post_prob = post_prob)
}

# COMMATT (successive differences coding)
if ("b_CommAtt2M1" %in% colnames(posterior_samples) & "b_CommAtt3M2" %in% colnames(posterior_samples)) {
  beta_2_vs_1 <- posterior_samples$b_CommAtt2M1
  beta_3_vs_2 <- posterior_samples$b_CommAtt3M2
  
  mu_1 <- alpha_samples
  mu_2 <- alpha_samples + beta_2_vs_1
  mu_3 <- alpha_samples + beta_2_vs_1 + beta_3_vs_2
  
  effect_list$CommAtt <- rbind(
    "commat 2 vs 1" = get_effect_summary(exp(mu_2) - exp(mu_1)),
    "commat 3 vs 2" = get_effect_summary(exp(mu_3) - exp(mu_2)),
    "commat 3 vs 1" = get_effect_summary(exp(mu_3) - exp(mu_1))
  )
}

# MODALITY (sum contrasts scaled by 0.5)
if ("b_Modality1" %in% colnames(posterior_samples) & "b_Modality2" %in% colnames(posterior_samples)) {
  beta_mod_1 <- posterior_samples$b_Modality1
  beta_mod_2 <- posterior_samples$b_Modality2
  
  mu_mod_1 <- alpha_samples + beta_mod_1
  mu_mod_2 <- alpha_samples + beta_mod_2
  mu_mod_3 <- alpha_samples - beta_mod_1 - beta_mod_2
  
  effect_list$Modality <- rbind(
    "mod 1 vs 2" = get_effect_summary(exp(mu_mod_1) - exp(mu_mod_2)),
    "mod 1 vs 3" = get_effect_summary(exp(mu_mod_1) - exp(mu_mod_3)),
    "mod 2 vs 3" = get_effect_summary(exp(mu_mod_2) - exp(mu_mod_3))
  )
}

# BIG5 (continuous)
if ("b_Big5" %in% colnames(posterior_samples)) {
  beta_big5 <- posterior_samples$b_Big5
  effect_list$Big5 <- get_effect_summary(exp(alpha_samples + beta_big5) - exp(alpha_samples))
}

# FAMILIARITY (continuous)
if ("b_Familiarity" %in% colnames(posterior_samples)) {
  beta_fam <- posterior_samples$b_Familiarity
  effect_list$Familiarity <- get_effect_summary(exp(alpha_samples + beta_fam) - exp(alpha_samples))
}

# EXPRESSIBILITY_Z (continuous)
if ("b_Expressibility_z" %in% colnames(posterior_samples)) {
  beta_expr <- posterior_samples$b_Expressibility_z
  effect_list$Expressibility_z <- get_effect_summary(exp(alpha_samples + beta_expr) - exp(alpha_samples))
}

# TRIAL NUMBER (centered continuous)
if ("b_TrialNumber_c" %in% colnames(posterior_samples)) {
  beta_trial <- posterior_samples$b_TrialNumber_c
  effect_list$TrialNumber_c <- get_effect_summary(exp(alpha_samples + beta_trial) - exp(alpha_samples))
}

# Convert to a nicely formatted data frame
effect_df <- do.call(rbind, effect_list)

# View effects
effect_df
                        mean         se lower_ci.2.5% upper_ci.97.5% post_prob
commat 2 vs 1     2.95068779 0.04052085    2.87281233      3.0294096   1.00000
commat 3 vs 2    -4.24033346 0.04140812   -4.32166524     -4.1601045   0.00000
commat 3 vs 1    -1.28964568 0.01620638   -1.32091801     -1.2576981   0.00000
mod 1 vs 2       -1.52303929 0.04030161   -1.60299691     -1.4435554   0.00000
mod 1 vs 3       -1.49131528 0.04098655   -1.57288685     -1.4107477   0.00000
mod 2 vs 3        0.03172401 0.04858764   -0.06380552      0.1252410   0.74175
Big5              1.31309529 0.04264827    1.23007768      1.3970338   1.00000
Familiarity       1.30972571 0.04408809    1.22420946      1.3983930   1.00000
Expressibility_z  0.44448503 0.01197175    0.42135330      0.4684496   1.00000

So here we also see the negative effect of combined modality (mod1)

Now we can also check some visual diagnostics.

plot(h1.m6)

# all good

plot(conditional_effects(h1.m6), points = TRUE)

# the effects all go in good direction

pp_check(h1.m6, type = "dens_overlay")

# we got rid of negative predictions, and it looks very good

pp_check(h1.m6, type = "error_scatter_avg")

# very nice blob

h1.m6_R2
    Estimate  Est.Error      Q2.5     Q97.5
R2 0.8889641 0.00137284 0.8861744 0.8915303
# explained variance increases to 88%

Model 6.1 - with correlation

Since we have now substantially improved the model performance, let’s once again try estimating the correlation between slopes and intercepts

h1.m6c <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z +  (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c), 
                data = final_data,
                family = lognormal(),
                iter = 4000,
                cores = 4)


# Add criteria for later diagnostics
h1.m6c <- add_criterion(h1.m6c, criterion = c("loo", "waic"))

# Calculate also variance explained (R^2)
h1.m6c_R2 <- bayes_R2(h1.m6c)

# Save both as objects
saveRDS(h1.m6c, here("09_Analysis_Modeling", "models", "h1.m6c.rds"))
saveRDS(h1.m6c_R2, here("09_Analysis_Modeling", "models", "h1.m6c_R2.rds"))

beep(5)
h1.m6c <- readRDS(here("09_Analysis_Modeling", "models", "h1.m6c.rds"))
h1.m6c_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m6c_R2.rds"))


# Summary
summary(h1.m6c)
 Family: lognormal 
  Links: mu = identity; sigma = identity 
Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c) 
   Data: final_data (Number of observations: 5076) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~Concept (Number of levels: 21) 
                           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
sd(Intercept)                  0.01      0.00     0.00     0.02 1.00     1744
sd(CommAtt2M1)                 0.01      0.00     0.00     0.02 1.00     5613
sd(CommAtt3M2)                 0.01      0.01     0.00     0.03 1.00     3784
cor(Intercept,CommAtt2M1)      0.28      0.49    -0.78     0.95 1.00    11362
cor(Intercept,CommAtt3M2)      0.33      0.47    -0.75     0.95 1.00     7628
cor(CommAtt2M1,CommAtt3M2)    -0.05      0.50    -0.89     0.87 1.00     8530
                           Tail_ESS
sd(Intercept)                  1448
sd(CommAtt2M1)                 5243
sd(CommAtt3M2)                 4533
cor(Intercept,CommAtt2M1)      5945
cor(Intercept,CommAtt3M2)      5887
cor(CommAtt2M1,CommAtt3M2)     6522

~Participant (Number of levels: 120) 
                           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
sd(Intercept)                  0.05      0.00     0.04     0.06 1.00     3640
sd(CommAtt2M1)                 0.01      0.00     0.00     0.02 1.00     5290
sd(CommAtt3M2)                 0.01      0.01     0.00     0.03 1.00     3474
cor(Intercept,CommAtt2M1)      0.35      0.46    -0.73     0.95 1.00    10725
cor(Intercept,CommAtt3M2)     -0.40      0.40    -0.94     0.59 1.00     9703
cor(CommAtt2M1,CommAtt3M2)    -0.27      0.49    -0.94     0.78 1.00     5410
                           Tail_ESS
sd(Intercept)                  5384
sd(CommAtt2M1)                 4470
sd(CommAtt3M2)                 4069
cor(Intercept,CommAtt2M1)      5934
cor(Intercept,CommAtt3M2)      5524
cor(CommAtt2M1,CommAtt3M2)     6511

~TrialNumber_c (Number of levels: 50) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.00      0.00     0.00     0.01 1.00     4362     4575

Regression Coefficients:
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept            1.23      0.01     1.22     1.24 1.00     5205     5979
CommAtt2M1           0.62      0.01     0.61     0.63 1.00    14611     6124
CommAtt3M2          -1.10      0.01    -1.11    -1.08 1.00    12219     6336
Modality1           -0.32      0.01    -0.33    -0.30 1.00    14134     6778
Modality2            0.16      0.01     0.15     0.18 1.00    15159     6553
Big5                 0.32      0.01     0.31     0.34 1.00     4826     6410
Familiarity          0.32      0.01     0.30     0.34 1.00     4174     5246
Expressibility_z     0.12      0.00     0.12     0.13 1.00    13884     6876

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.17      0.00     0.17     0.18 1.00    17858     5344

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
# Coefficients remain mostly unchanged 
plot(h1.m6c)

# now correlation does not seem to generate problems

plot(conditional_effects(h1.m6c), points = TRUE)

# the effects all go in good direction

pp_check(h1.m6c, type = "dens_overlay")

# ppcheck good

pp_check(h1.m6c, type = "error_scatter_avg")

# nice blob

h1.m6c_R2
    Estimate   Est.Error      Q2.5     Q97.5
R2 0.8890392 0.001393478 0.8861745 0.8916361
# explained variance remains around 88%

Diagnostics I

We now have several models that we can test for predictive performance.

Rhat

Rhat tells us whether the model’s Markov chains have converged—values close to 1 mean the model has likely converged well.

# Extract R-hat values for each model
# (note: deparse(substitute(model)) cannot recover the original model names
#  inside lapply(), which is why the "model" column below just shows "X[[i]]")
rhat_list <- lapply(model_list, function(model) {
  rhat_values <- rhat(model)
  data.frame(model = deparse(substitute(model)), 
             max_rhat = max(rhat_values), 
             min_rhat = min(rhat_values))
})

# Combine and inspect
do.call(rbind, rhat_list)
   model max_rhat  min_rhat
1 X[[i]] 1.002349 0.9996758
2 X[[i]] 1.011516 0.9997215
3 X[[i]] 1.005103 0.9997167
4 X[[i]] 1.003503 0.9996573
5 X[[i]] 1.004119 0.9996640
6 X[[i]] 1.003712 0.9996386
7 X[[i]] 1.002459 0.9995773
8 X[[i]] 1.002605 0.9996426

All models actually look OK in terms of Rhat values, except model 2 (h1.m2), whose maximum Rhat is slightly above the conventional 1.01 cutoff

ESS

Effective sample size tells us how many independent samples the model has effectively drawn from the posterior distribution. Low ESS suggests autocorrelation (i.e., the sampler keeps exploring one part of the posterior), while high ESS indicates good mixing

# Extract n_eff values for each model
neff_ratio_list <- lapply(model_list, function(model) {
  neff_values <- neff_ratio(model)              # Here we calculate ratio (not the raw number of effective samples)
  data.frame(model = deparse(substitute(model)), 
             min_neff = min(neff_values), 
             max_neff = max(neff_values),
             mean_neff = mean(neff_values))
               
})

# Combine and inspect
do.call(rbind, neff_ratio_list)
   model   min_neff  max_neff mean_neff
1 X[[i]] 0.23283181 0.8832823 0.6862927
2 X[[i]] 0.04396011 0.9020581 0.6204512
3 X[[i]] 0.14481431 0.8834718 0.7504325
4 X[[i]] 0.22967606 0.9020381 0.7709286
5 X[[i]] 0.11769241 0.9065446 0.7505601
6 X[[i]] 0.20616401 0.9322803 0.7457049
7 X[[i]] 0.28072060 0.8984722 0.7626900
8 X[[i]] 0.18101711 0.9370772 0.7924409

So the highest ratios belong to model h1.m6c (lognormal with correlation), but in fact they are all quite comparable. Let’s look at the 3 highest

effective_sample(h1.m6c) # this one indeed seems the best as it has all ESS around 10k
           Parameter   ESS
1        b_Intercept  5193
2       b_CommAtt2M1 14438
3       b_CommAtt3M2 12340
4        b_Modality1 14518
5        b_Modality2 14927
6             b_Big5  4824
7      b_Familiarity  4162
8 b_Expressibility_z 14233
effective_sample(h1.m6)
           Parameter   ESS
1        b_Intercept  3104
2       b_CommAtt2M1  9618
3       b_CommAtt3M2  9509
4        b_Modality1  9057
5        b_Modality2  9192
6             b_Big5  2906
7      b_Familiarity  2280
8 b_Expressibility_z 10811
effective_sample(h1.m3p) 
           Parameter   ESS
1        b_Intercept  8533
2       b_CommAtt2M1  2795
3       b_CommAtt3M2  2385
4        b_Modality1 13744
5        b_Modality2 13144
6             b_Big5 12483
7      b_Familiarity 13139
8 b_Expressibility_z 15614

So there are some considerable differences for different coefficients

LOO & WAIC

Leave-One-Out (LOO) cross-validation is a technique where, for each iteration, one data point is left out as the test set and the model is trained on the remaining data points; this process is repeated for each data point, and the model’s overall performance is averaged over all iterations. In practice, brms/loo approximates this with Pareto-smoothed importance sampling (PSIS-LOO) rather than refitting the model for every data point.

l <- loo_compare(h1.m1, h1.m2, h1.m3, h1.m3p, h1.m4, h1.m5, h1.m6, h1.m6c, criterion = "loo")

print(l, simplify = F)
       elpd_diff se_diff elpd_loo se_elpd_loo p_loo   se_p_loo looic   se_looic
h1.m6c     0.0       0.0 -5070.5     69.3       126.5     3.5  10141.0   138.5 
h1.m6     -0.2       1.5 -5070.7     69.2       120.4     3.3  10141.4   138.5 
h1.m5   -904.2      59.2 -5974.6     74.1       418.5     5.3  11949.3   148.1 
h1.m2  -1143.5      67.5 -6214.0     78.4       285.6    10.7  12428.0   156.8 
h1.m3  -1221.1      66.5 -6291.6     75.5       261.3     7.7  12583.2   150.9 
h1.m3p -1222.5      66.9 -6293.0     76.3       281.6     8.3  12586.0   152.5 
h1.m4  -1224.7      67.4 -6295.2     77.0       281.3     8.4  12590.3   154.0 
h1.m1  -1681.0      72.2 -6751.5     74.9        25.3     0.8  13503.0   149.9 
Note

elpd_loo: This is the expected log pointwise predictive density for LOO. Higher values indicate a better fit to the data.

se_elpd_loo: The standard error of the elpd_loo, representing uncertainty in the model’s predictive fit according to LOO.

looic: The LOO Information Criterion, which is similar to waic but based on leave-one-out cross-validation. Lower values are better.

p_loo: The effective number of parameters according to LOO, indicating the model’s complexity.

se_p_loo: The standard error of p_loo, representing uncertainty around the effective number of parameters.

The models with the lognormal distribution (h1.m6 and h1.m6c) come out best as assessed by LOO.

w <- loo_compare(h1.m1, h1.m2, h1.m3, h1.m3p, h1.m4, h1.m5, h1.m6, h1.m6c, criterion = "waic")

print(w, simplify = F)
       elpd_diff se_diff elpd_waic se_elpd_waic p_waic  se_p_waic waic   
h1.m6c     0.0       0.0 -5070.3      69.3        126.3     3.5   10140.5
h1.m6     -0.3       1.5 -5070.5      69.2        120.2     3.3   10141.0
h1.m5   -903.7      59.2 -5974.0      74.1        417.8     5.3   11948.0
h1.m2  -1142.2      67.5 -6212.5      78.3        284.1    10.6   12424.9
h1.m3  -1220.4      66.5 -6290.7      75.4        260.3     7.6   12581.3
h1.m3p -1221.6      66.9 -6291.8      76.2        280.4     8.2   12583.7
h1.m4  -1223.8      67.3 -6294.1      77.0        280.3     8.4   12588.2
h1.m1  -1681.2      72.2 -6751.5      74.9         25.3     0.8   13502.9
       se_waic
h1.m6c   138.5
h1.m6    138.5
h1.m5    148.1
h1.m2    156.6
h1.m3    150.9
h1.m3p   152.5
h1.m4    153.9
h1.m1    149.9
# see Solomon Kurz
cbind(waic_diff = w[,1] * -2,
      se = w[,2] * 2)
          waic_diff         se
h1.m6c    0.0000000   0.000000
h1.m6     0.5289111   2.917133
h1.m5  1807.4580148 118.471557
h1.m2  2284.3843630 134.987550
h1.m3  2440.8062251 132.903242
h1.m3p 2443.1522534 133.864984
h1.m4  2447.6767321 134.678147
h1.m1  3362.3866702 144.475730
Note

elpd_waic (expected log pointwise predictive density for WAIC): This represents the model’s predictive fit to the data. Higher values indicate a better fit.

se_elpd_waic (standard error of elpd_waic): Measures uncertainty around the elpd_waic estimate.

waic: The Widely Applicable Information Criterion, a measure of model fit where lower values indicate a better fit.

se_waic (standard error of WAIC): Uncertainty around the WAIC estimate.

elpd_diff: The difference in elpd_waic between the model in question and the best model in the comparison (here h1.m6c, which has an elpd_diff of 0). A negative value indicates that the model fits worse than the best model.

se_diff: The standard error of the elpd_diff, indicating how much uncertainty there is in the difference in predictive performance.

p_waic: The number of effective parameters in the model (related to model complexity). Lower values indicate simpler models, and higher values suggest more complexity.

Plot the comparison

model_weights(h1.m1, h1.m2, h1.m3, h1.m3p, h1.m4, h1.m5, h1.m6, h1.m6c, weights = "waic") %>% 
  round(digits = 2)
 h1.m1  h1.m2  h1.m3 h1.m3p  h1.m4  h1.m5  h1.m6 h1.m6c 
  0.00   0.00   0.00   0.00   0.00   0.00   0.43   0.57 
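
As a visual counterpart to the comparison table, here is a hedged sketch of one way to plot the WAIC differences (the data-frame wrangling and the ggplot call are assumptions, not the original plotting code):

# Put the loo_compare (WAIC) matrix into a data frame for plotting
w_df <- data.frame(
  model     = rownames(w),
  elpd_diff = w[, "elpd_diff"],
  se_diff   = w[, "se_diff"]
)

# elpd differences (relative to the best model) with +/- 1 SE error bars
ggplot(w_df, aes(x = reorder(model, elpd_diff), y = elpd_diff)) +
  geom_pointrange(aes(ymin = elpd_diff - se_diff, ymax = elpd_diff + se_diff)) +
  coord_flip() +
  labs(x = NULL, y = "elpd difference (WAIC)")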

So as the pp_check already suggested, the lognormal models indeed seem to have the most predictive power. For these particular (synthetic) data, we will now proceed with model h1.m6c

We will first add some mildly informative priors, and then we will also try adding some interaction terms and run the comparison once again.

Model 7 - lognormal with priors

Let’s first check what priors have been selected as default for h1.m6c

# Print priors
prior_summary(h1.m6c)
                  prior     class             coef         group resp dpar
                 (flat)         b                                         
                 (flat)         b             Big5                        
                 (flat)         b       CommAtt2M1                        
                 (flat)         b       CommAtt3M2                        
                 (flat)         b Expressibility_z                        
                 (flat)         b      Familiarity                        
                 (flat)         b        Modality1                        
                 (flat)         b        Modality2                        
 student_t(3, 1.3, 2.5) Intercept                                         
   lkj_corr_cholesky(1)         L                                         
   lkj_corr_cholesky(1)         L                        Concept          
   lkj_corr_cholesky(1)         L                    Participant          
   student_t(3, 0, 2.5)        sd                                         
   student_t(3, 0, 2.5)        sd                        Concept          
   student_t(3, 0, 2.5)        sd       CommAtt2M1       Concept          
   student_t(3, 0, 2.5)        sd       CommAtt3M2       Concept          
   student_t(3, 0, 2.5)        sd        Intercept       Concept          
   student_t(3, 0, 2.5)        sd                    Participant          
   student_t(3, 0, 2.5)        sd       CommAtt2M1   Participant          
   student_t(3, 0, 2.5)        sd       CommAtt3M2   Participant          
   student_t(3, 0, 2.5)        sd        Intercept   Participant          
   student_t(3, 0, 2.5)        sd                  TrialNumber_c          
   student_t(3, 0, 2.5)        sd        Intercept TrialNumber_c          
   student_t(3, 0, 2.5)     sigma                                         
 nlpar lb ub       source
                  default
             (vectorized)
             (vectorized)
             (vectorized)
             (vectorized)
             (vectorized)
             (vectorized)
             (vectorized)
                  default
                  default
             (vectorized)
             (vectorized)
        0         default
        0    (vectorized)
        0    (vectorized)
        0    (vectorized)
        0    (vectorized)
        0    (vectorized)
        0    (vectorized)
        0    (vectorized)
        0    (vectorized)
        0    (vectorized)
        0    (vectorized)
        0         default

We can keep all the defaults, but we do not need to leave flat priors on the beta coefficients, as we do have some assumptions/expectations

Let’s reuse the priors we already used for h1.m3p

priors_h1m7 <- c(
  set_prior("normal(2.5, 0.5)", class = "Intercept", lb=0),
  set_prior("normal(0,0.50)", class = "b", coef = "CommAtt2M1"),
  set_prior("normal(0,0.50)", class = "b", coef = "CommAtt3M2"),
  set_prior("normal(0,0.25)", class = "b", coef = "Modality1"),
  set_prior("normal(0,0.25)", class = "b", coef = "Modality2"),
  set_prior("normal(0,0.25)", class = "b", coef = "Big5"),
  set_prior("normal(0,0.25)", class = "b", coef = "Familiarity"),
  set_prior("normal(0,0.25)", class = "b", coef = "Expressibility_z")
)

# The rest we will leave default (and check afterwards)
h1.m7 <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z +  (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c), 
                data = final_data,
                family = lognormal(),
                prior = priors_h1m7,
                iter = 4000,
                cores = 4)


# Add criteria for later diagnostics
h1.m7 <- add_criterion(h1.m7, criterion = c("loo", "waic"))

# Calculate also variance explained (R^2)
h1.m7_R2 <- bayes_R2(h1.m7)

# Save both as objects
saveRDS(h1.m7, here("09_Analysis_Modeling", "models", "h1.m7.rds"))
saveRDS(h1.m7_R2, here("09_Analysis_Modeling", "models", "h1.m7_R2.rds"))

beep(5)
h1.m7 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m7.rds"))
h1.m7_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m7_R2.rds"))

# Summary
summary(h1.m7)
 Family: lognormal 
  Links: mu = identity; sigma = identity 
Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c) 
   Data: final_data (Number of observations: 5076) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~Concept (Number of levels: 21) 
                           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
sd(Intercept)                  0.01      0.00     0.00     0.02 1.00     2406
sd(CommAtt2M1)                 0.01      0.00     0.00     0.02 1.00     5336
sd(CommAtt3M2)                 0.01      0.01     0.00     0.03 1.00     3663
cor(Intercept,CommAtt2M1)      0.27      0.49    -0.79     0.95 1.00     9049
cor(Intercept,CommAtt3M2)      0.34      0.47    -0.74     0.95 1.00     7153
cor(CommAtt2M1,CommAtt3M2)    -0.05      0.50    -0.89     0.88 1.00     8030
                           Tail_ESS
sd(Intercept)                  2582
sd(CommAtt2M1)                 4860
sd(CommAtt3M2)                 4212
cor(Intercept,CommAtt2M1)      5794
cor(Intercept,CommAtt3M2)      6108
cor(CommAtt2M1,CommAtt3M2)     5937

~Participant (Number of levels: 120) 
                           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
sd(Intercept)                  0.05      0.00     0.04     0.06 1.00     3562
sd(CommAtt2M1)                 0.01      0.00     0.00     0.02 1.00     5277
sd(CommAtt3M2)                 0.01      0.01     0.00     0.03 1.00     3118
cor(Intercept,CommAtt2M1)      0.34      0.46    -0.72     0.95 1.00    11493
cor(Intercept,CommAtt3M2)     -0.40      0.40    -0.94     0.57 1.00     7854
cor(CommAtt2M1,CommAtt3M2)    -0.26      0.48    -0.94     0.77 1.00     5771
                           Tail_ESS
sd(Intercept)                  5189
sd(CommAtt2M1)                 4743
sd(CommAtt3M2)                 3237
cor(Intercept,CommAtt2M1)      6204
cor(Intercept,CommAtt3M2)      5447
cor(CommAtt2M1,CommAtt3M2)     6837

~TrialNumber_c (Number of levels: 50) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.00      0.00     0.00     0.01 1.00     4640     4109

Regression Coefficients:
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept            1.23      0.01     1.22     1.24 1.00     5166     5351
CommAtt2M1           0.62      0.01     0.61     0.63 1.00    12440     6165
CommAtt3M2          -1.10      0.01    -1.11    -1.08 1.00    10543     6212
Modality1           -0.31      0.01    -0.33    -0.30 1.00    13106     5622
Modality2            0.16      0.01     0.15     0.17 1.00    13551     6493
Big5                 0.32      0.01     0.31     0.34 1.00     4012     4552
Familiarity          0.32      0.01     0.30     0.34 1.00     4337     5552
Expressibility_z     0.12      0.00     0.12     0.13 1.00    11948     6505

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.17      0.00     0.17     0.18 1.00    15087     5361

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
# Coefficients remain mostly unchanged 
plot(h1.m7)

# all good

plot(conditional_effects(h1.m7), points = TRUE)

# the effects all go in good direction

pp_check(h1.m7, type = "dens_overlay")

# nice

pp_check(h1.m7, type = "error_scatter_avg")

# unchanged

h1.m7_R2
    Estimate   Est.Error      Q2.5     Q97.5
R2 0.8889672 0.001399296 0.8861732 0.8916425
# explained variance remains around 88%

Let’s now also check whether the priors make sense
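
A hedged way to do this is to refit h1.m7 while sampling from the priors only and then inspect the implied outcome distribution (the object name h1.m7_prior is hypothetical; this sketch is not one of the fitted models above):

# Sample from the priors only (no likelihood) and compare the implied
# outcomes against the observed Eff values
h1.m7_prior <- update(h1.m7, sample_prior = "only", cores = 4)
pp_check(h1.m7_prior, type = "dens_overlay")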

Model 8 - adding interactions

Possible interactions include:

  • CommAtt x Modality - Effort could increase differently across modalities, depending on whether the concept is guessed on the first, second or third attempt. E.g., gesture might require more effort on the initial attempt, but the vocal modality might require more effort on repeated attempts. This is an interesting question and it would help disentangle the benefit of multimodality over unimodality (i.e., multimodal trials might be more effortful, but require fewer attempts overall). However, we already model the effect of modality on effort, so maybe this is not a top priority.

  • CommAtt x Expressibility - Higher expressibility should moderate the effect of repeated attempts, such that the increase in effort with each additional attempt is smaller (or bigger?) for more expressible concepts. This is not a priority for our analysis.

  • Modality × Expressibility_z - The influence of expressibility on effort could be modality-specific: perhaps effort increases less with expressibility in the combined modality. For our analysis, this is also not a priority (especially since expressibility already has modality encoded in it).

  • Familiarity x CommAtt - More familiar partners may guess faster (fewer attempts) and require less effort, but this effect could diminish over multiple attempts. For our analysis, this is not a priority.

  • Big5 × Modality or Big5 × CommAtt - More open/extraverted participants might maintain higher effort over attempts, or adjust more dynamically depending on the communicative channel. While not a priority, it is an interesting question if we want to tap more into inter-individual variability.

Let’s try CommAtt x Modality and Big5 x CommAtt

h1.m8 <- brm(Eff ~ 1 + CommAtt * Modality + Big5 * CommAtt + Familiarity + Expressibility_z +  (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c), 
                data = final_data,
                family = lognormal(),
                prior = priors_h1m7, # we keep the priors from previous model
                iter = 4000,
                cores = 4)


# Add criteria for later diagnostics
h1.m8 <- add_criterion(h1.m8, criterion = c("loo", "waic"))

# Calculate also variance explained (R^2)
h1.m8_R2 <- bayes_R2(h1.m8)

# Save both as objects
saveRDS(h1.m8, here("09_Analysis_Modeling", "models", "h1.m8.rds"))
saveRDS(h1.m8_R2, here("09_Analysis_Modeling", "models", "h1.m8_R2.rds"))

beep(5)
h1.m8 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m8.rds"))
h1.m8_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m8_R2.rds"))


# Summary
summary(h1.m8)
 Family: lognormal 
  Links: mu = identity; sigma = identity 
Formula: Eff ~ 1 + CommAtt * Modality + Big5 * CommAtt + Familiarity + Expressibility_z + (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c) 
   Data: final_data (Number of observations: 5076) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~Concept (Number of levels: 21) 
                           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
sd(Intercept)                  0.01      0.00     0.00     0.02 1.00     2549
sd(CommAtt2M1)                 0.01      0.00     0.00     0.02 1.00     5836
sd(CommAtt3M2)                 0.01      0.01     0.00     0.03 1.00     4322
cor(Intercept,CommAtt2M1)      0.29      0.48    -0.76     0.94 1.00    10754
cor(Intercept,CommAtt3M2)      0.36      0.46    -0.70     0.95 1.00     8535
cor(CommAtt2M1,CommAtt3M2)    -0.04      0.50    -0.89     0.86 1.00     7971
                           Tail_ESS
sd(Intercept)                  2905
sd(CommAtt2M1)                 4937
sd(CommAtt3M2)                 4495
cor(Intercept,CommAtt2M1)      5912
cor(Intercept,CommAtt3M2)      6615
cor(CommAtt2M1,CommAtt3M2)     6944

~Participant (Number of levels: 120) 
                           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
sd(Intercept)                  0.05      0.00     0.04     0.06 1.00     3682
sd(CommAtt2M1)                 0.01      0.00     0.00     0.02 1.00     5068
sd(CommAtt3M2)                 0.01      0.01     0.00     0.03 1.00     3392
cor(Intercept,CommAtt2M1)      0.34      0.45    -0.71     0.94 1.00    12364
cor(Intercept,CommAtt3M2)     -0.41      0.39    -0.94     0.57 1.00     9996
cor(CommAtt2M1,CommAtt3M2)    -0.26      0.49    -0.95     0.78 1.00     5862
                           Tail_ESS
sd(Intercept)                  5433
sd(CommAtt2M1)                 5479
sd(CommAtt3M2)                 3982
cor(Intercept,CommAtt2M1)      5945
cor(Intercept,CommAtt3M2)      6038
cor(CommAtt2M1,CommAtt3M2)     6673

~TrialNumber_c (Number of levels: 50) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.00      0.00     0.00     0.01 1.00     4900     4106

Regression Coefficients:
                     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
Intercept                1.23      0.01     1.22     1.24 1.00     5217
CommAtt2M1               0.62      0.01     0.61     0.63 1.00    14494
CommAtt3M2              -1.10      0.01    -1.11    -1.08 1.00    13959
Modality1               -0.31      0.01    -0.33    -0.30 1.00    12978
Modality2                0.16      0.01     0.15     0.18 1.00    13925
Big5                     0.32      0.01     0.30     0.34 1.00     4506
Familiarity              0.32      0.01     0.30     0.34 1.00     3885
Expressibility_z         0.12      0.00     0.12     0.13 1.00    12340
CommAtt2M1:Modality1     0.01      0.02    -0.02     0.04 1.00    13065
CommAtt3M2:Modality1     0.01      0.02    -0.03     0.05 1.00    12851
CommAtt2M1:Modality2    -0.01      0.02    -0.04     0.02 1.00    13667
CommAtt3M2:Modality2     0.02      0.02    -0.02     0.06 1.00    12109
CommAtt2M1:Big5         -0.00      0.01    -0.02     0.02 1.00    14725
CommAtt3M2:Big5         -0.01      0.01    -0.04     0.01 1.00    14109
                     Tail_ESS
Intercept                6099
CommAtt2M1               7032
CommAtt3M2               6248
Modality1                6662
Modality2                6627
Big5                     5386
Familiarity              5579
Expressibility_z         6403
CommAtt2M1:Modality1     6974
CommAtt3M2:Modality1     6884
CommAtt2M1:Modality2     6960
CommAtt3M2:Modality2     6404
CommAtt2M1:Big5          6229
CommAtt3M2:Big5          6504

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.17      0.00     0.17     0.18 1.00    17574     5831

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
# Coefficients remain mostly unchanged 
# Since we did not really focus on the interactions during the simulation, we also don't have strong expectations here. But for the real data, there is good reason to expect some meaningful values
plot(h1.m8)

# all good

plot(conditional_effects(h1.m8), points = TRUE)

# the effects all go in good direction
# we can see here that combined modality remains moderated across all commatts
# and also big5 seems to matter the most in the second attempt

pp_check(h1.m8, type = "dens_overlay")

# as with the other lognormal models, there are no negative predictions; looks good

pp_check(h1.m8, type = "error_scatter_avg")

# unchanged

h1.m8_R2
    Estimate   Est.Error      Q2.5     Q97.5
R2 0.8889759 0.001459646 0.8859485 0.8916926
# explained variance remains around 88%

Diagnostics II

ESS

effective_sample(h1.m6) 
           Parameter   ESS
1        b_Intercept  3104
2       b_CommAtt2M1  9618
3       b_CommAtt3M2  9509
4        b_Modality1  9057
5        b_Modality2  9192
6             b_Big5  2906
7      b_Familiarity  2280
8 b_Expressibility_z 10811
effective_sample(h1.m6c)
           Parameter   ESS
1        b_Intercept  5193
2       b_CommAtt2M1 14438
3       b_CommAtt3M2 12340
4        b_Modality1 14518
5        b_Modality2 14927
6             b_Big5  4824
7      b_Familiarity  4162
8 b_Expressibility_z 14233
effective_sample(h1.m7) 
           Parameter   ESS
1        b_Intercept  5149
2       b_CommAtt2M1 12363
3       b_CommAtt3M2 10483
4        b_Modality1 13095
5        b_Modality2 13808
6             b_Big5  3976
7      b_Familiarity  4298
8 b_Expressibility_z 11917
effective_sample(h1.m8) 
                Parameter   ESS
1             b_Intercept  5221
2            b_CommAtt2M1 14165
3            b_CommAtt3M2 13961
4             b_Modality1 12872
5             b_Modality2 13799
6                  b_Big5  4484
7           b_Familiarity  3878
8      b_Expressibility_z 12296
9  b_CommAtt2M1:Modality1 13103
10 b_CommAtt3M2:Modality1 12880
11 b_CommAtt2M1:Modality2 13651
12 b_CommAtt3M2:Modality2 12303
13      b_CommAtt2M1:Big5 15040
14      b_CommAtt3M2:Big5 13963
# Now h1m6 looks like the weakest, while h1m8 looks much better - but the ESS for the Intercept is for some reason still quite low

LOO & WAIC

l <- loo_compare(h1.m6, h1.m6c, h1.m7, h1.m8, criterion = "loo")

print(l, simplify = F)
       elpd_diff se_diff elpd_loo se_elpd_loo p_loo   se_p_loo looic   se_looic
h1.m6c     0.0       0.0 -5070.5     69.3       126.5     3.5  10141.0   138.5 
h1.m6     -0.2       1.5 -5070.7     69.2       120.4     3.3  10141.4   138.5 
h1.m7     -0.3       0.2 -5070.8     69.3       126.8     3.5  10141.5   138.6 
h1.m8     -4.4       1.9 -5074.8     69.2       132.9     3.7  10149.7   138.5 
w <- loo_compare(h1.m6, h1.m6c, h1.m7, h1.m8, criterion = "waic")

print(w, simplify = F)
       elpd_diff se_diff elpd_waic se_elpd_waic p_waic  se_p_waic waic   
h1.m6c     0.0       0.0 -5070.3      69.3        126.3     3.5   10140.5
h1.m6     -0.3       1.5 -5070.5      69.2        120.2     3.3   10141.0
h1.m7     -0.3       0.2 -5070.5      69.3        126.5     3.5   10141.1
h1.m8     -4.4       1.9 -5074.6      69.2        132.7     3.7   10149.2
       se_waic
h1.m6c   138.5
h1.m6    138.5
h1.m7    138.6
h1.m8    138.5
# see Solomon Kurz
cbind(waic_diff = w[,1] * -2,
      se = w[,2] * 2)
       waic_diff        se
h1.m6c 0.0000000 0.0000000
h1.m6  0.5289111 2.9171328
h1.m7  0.5582963 0.3803033
h1.m8  8.7265395 3.7869214

So in terms of WAIC & LOO, the interactions do not really add predictive power. This might be specific to the synthetic data, as we did not explicitly focus on the interaction coefficients there. At the same time, the difference between h1.m8 and h1.m6c is not substantial. For now, we stop here. With the full data, we will proceed with the same evaluation.

H2: Modelling

Now we also need to account for our second hypothesis, namely

H2: A higher degree of misunderstanding will require a performer to engage in more effortful correction.

We operationalize the degree of misunderstanding as the cosine similarity between the performer’s target and the guesser’s answer (PrevAn). The difficulty with our data structure is that PrevAn has values only for the second and third corrections (i.e., there is no previous answer for the first performance). If we left it unchanged, the model would drop all rows with NA values, which means we would lose the relationship between effort in the second and first communicative attempts. To avoid this, we will create a new variable which we call Effort Change. We simply calculate the change from effort in communicative attempt x to communicative attempt x+1. We still lose communicative attempt 1, but the change associated with CommAtt==2 already encapsulates the relationship to the first attempt.

Data wrangling I

# Calculate Effort Change (Difference)
final_data_2 <- final_data %>%
  group_by(Participant, Concept) %>%
  mutate(
    Effort_1 = Eff[CommAtt == 1][1],  # Effort for attempt 1
    Effort_2 = Eff[CommAtt == 2][1],  # Effort for attempt 2
    Effort_3 = Eff[CommAtt == 3][1],  # Effort for attempt 3
    Effort_Change_1_to_2 = case_when(
      CommAtt == 2 & !is.na(Effort_1) ~ Eff - Effort_1,  # Change from 1st to 2nd attempt
      TRUE ~ NA_real_
    ),
    Effort_Change_2_to_3 = case_when(
      CommAtt == 3 & !is.na(Effort_2) ~ Eff - Effort_2,  # Change from 2nd to 3rd attempt
      TRUE ~ NA_real_
    )
  ) %>%
  ungroup()

# Coalesce the changes into a single column
final_data_2 <- final_data_2 %>%
  mutate(
    Effort_Change = coalesce(Effort_Change_1_to_2, Effort_Change_2_to_3)
  ) 

# Remove unnecessary columns
final_data_2 <- subset(final_data_2, select = -c(Effort_1, Effort_2, Effort_3, Effort_Change_1_to_2, Effort_Change_2_to_3)) 

# View the result
head(final_data_2, n=15)
# A tibble: 15 × 11
   Participant Concept Modality  Big5 Familiarity Expressibility CommAtt   Eff
         <dbl>   <dbl> <chr>    <dbl>       <dbl>          <dbl>   <dbl> <dbl>
 1           1       1 combined  1.92        1.84        0.576         1  4.34
 2           1       1 combined  1.92        1.84        0.576         2  6.57
 3           1       1 combined  1.92        1.84        0.576         3  2.37
 4           1       2 gesture   1.92        1.84        0.349         1  5.50
 5           1       2 gesture   1.92        1.84        0.349         2 10.4 
 6           1       2 gesture   1.92        1.84        0.349         3  4.00
 7           1       3 vocal     1.92        1.84        0.340         1  6.15
 8           1       3 vocal     1.92        1.84        0.340         2 12.9 
 9           1       3 vocal     1.92        1.84        0.340         3  3.59
10           1       4 combined  1.92        1.84        0.719         1  4.09
11           1       4 combined  1.92        1.84        0.719         2  6.41
12           1       5 gesture   1.92        1.84        0.0781        1  5.28
13           1       6 gesture   1.92        1.84        0.00419       1  5.61
14           1       6 gesture   1.92        1.84        0.00419       2 12.0 
15           1       6 gesture   1.92        1.84        0.00419       3  3.44
# ℹ 3 more variables: TrialNumber <dbl>, PrevAn <dbl>, Effort_Change <dbl>

Exploring structure

This is the relationship between effort and answer similarity (H2) as seen in the raw (synthetic) data

This is how effort change is distributed
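
A hedged sketch of this distribution plot (the figure itself is not reproduced here):

# Distribution of Effort_Change (NA rows for the first attempt dropped)
final_data_2 %>%
  drop_na(Effort_Change) %>%
  ggplot(aes(x = Effort_Change)) +
  geom_histogram(bins = 60)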

So we will have to work with a bimodal distribution, as we have either a decrease or an increase in effort

H2: Stating causal model

We now also need a new DAG. Essentially, whatever we said influences CommAtt in H1 will now also influence PrevAn, because the two are tightly related. For instance, more extroverted people can be expected to be better guessers, and therefore the similarity of the previous answer will be higher.
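
A hedged sketch of how such a DAG (daggy_h2) could be specified with dagify() from ggdag; the exact edge set is an assumption reconstructed from the description above, not the original definition:

daggy_h2 <- dagify(
  # everything that influenced CommAtt in H1 now also influences PrevAn,
  # and PrevAn influences both CommAtt and the effort change
  EffChange ~ PrevAn + CommAtt + Big5 + Fam + Expr + Mod + Conc + Pcn + TrNum,
  CommAtt   ~ PrevAn,
  PrevAn    ~ Big5 + Fam + Expr + Mod + Conc + Pcn + TrNum,
  exposure  = "PrevAn",
  outcome   = "EffChange"
)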

dagitty::adjustmentSets(daggy_h2, exposure = "PrevAn", outcome = "EffChange")
{ Big5, Conc, Expr, Fam, Mod, Pcn, TrNum }

The adjustment set is identical to that of H1. Note that we are omitting here the arrows going from all these variables to CommAtt. Since PrevAn affects CommAtt, and not the other way around, we do not need to block this path to avoid confounds. However, if we want to assess the direct effect of PrevAn on EffChange, we have to add CommAtt to the model as well.

H2: Modelling

Converting columns to factors

filtered_data$CommAtt <- as.factor(filtered_data$CommAtt)
filtered_data$CommAtt <- droplevels(filtered_data$CommAtt) # commatt1 still remains as a level
filtered_data$CommAtt <- as.factor(filtered_data$CommAtt) # redo again
filtered_data$Modality <- as.factor(filtered_data$Modality)
filtered_data$Participant <- as.factor(filtered_data$Participant)
filtered_data$Concept <- as.factor(filtered_data$Concept)

filtered_data$TrialNumber <- as.numeric(filtered_data$TrialNumber)  # Ensure TrialNumber is numeric

Contrast-coding for categorical predictors

contrasts(filtered_data$CommAtt) <- contr.sum(2)/1
contrasts(filtered_data$Modality) <- contr.sum(3)/2

Centering continuous predictors

filtered_data$TrialNumber_c <- filtered_data$TrialNumber - median(range(filtered_data$TrialNumber))
filtered_data$Familiarity <- filtered_data$Familiarity - median(range(filtered_data$Familiarity))
filtered_data$Big5 <- filtered_data$Big5 - median(range(filtered_data$Big5))

Z-scoring expressibility and similarity (PrevAn)

filtered_data <-
  filtered_data |>
  group_by(Modality) |>
  mutate(Expressibility_z = (Expressibility - mean(Expressibility))/ sd(filtered_data$Expressibility, na.rm = T)) |>
  ungroup()

filtered_data <-
  filtered_data |>
  mutate(PrevAn_z = (PrevAn - mean(PrevAn))/ sd(filtered_data$PrevAn, na.rm = T)) |>
  ungroup()

Note: maybe we should just center, rather than z-score, these predictors to get a better sense of them (right now, for me they are still a bit hard to interpret)

Model 1 - DAG

Similarly to H1, we will start with reproducing the DAG for H2

h2.m1 <- brm(Effort_Change ~ 1 + PrevAn_z + CommAtt + Familiarity + Big5 + Expressibility_z + TrialNumber_c + Modality + (1 | Participant) + (1 | Concept),
                data = filtered_data,
                iter = 4000,
                cores = 4)

# Add criteria for later diagnostics
h2.m1 <- add_criterion(h2.m1, criterion = c("loo", "waic"))

# Calculate also variance explained (R^2)
h2.m1_R2 <- bayes_R2(h2.m1)

# Save both as objects
saveRDS(h2.m1, here("09_Analysis_Modeling", "models", "h2.m1.rds"))
saveRDS(h2.m1_R2, here("09_Analysis_Modeling", "models", "h2.m1_R2.rds"))

beep(5)
h2.m1 <- readRDS(here("09_Analysis_Modeling", "models", "h2.m1.rds"))
h2.m1_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h2.m1_R2.rds"))


# Summary
summary(h2.m1)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: Effort_Change ~ 1 + PrevAn_z + CommAtt + Familiarity + Big5 + Expressibility_z + TrialNumber_c + Modality + (1 | Participant) + (1 | Concept) 
   Data: filtered_data (Number of observations: 2556) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~Concept (Number of levels: 21) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.03      0.02     0.00     0.09 1.00     5564     4490

~Participant (Number of levels: 120) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.02      0.02     0.00     0.07 1.00     6328     4353

Regression Coefficients:
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept           -0.66      0.03    -0.71    -0.60 1.00    13909     5911
PrevAn_z            -0.62      0.02    -0.66    -0.57 1.00    17850     5508
CommAtt1             3.73      0.03     3.68     3.79 1.00    19981     5855
Familiarity          0.15      0.05     0.06     0.24 1.00    16619     5695
Big5                 0.14      0.04     0.06     0.23 1.00    16358     5795
Expressibility_z     0.08      0.03     0.02     0.13 1.00    17823     5427
TrialNumber_c        0.00      0.00    -0.00     0.01 1.00    11346     5008
Modality1           -0.14      0.07    -0.28     0.00 1.00    13812     6660
Modality2            0.01      0.07    -0.12     0.15 1.00    11997     6265

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     1.25      0.02     1.21     1.29 1.00    19597     4922

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
# coefficients seem quite conservative, even for PrevAn (but remember, it's z-scored)
plot(h2.m1)

# all looks ok

plot(conditional_effects(h2.m1), points = TRUE)

# the main effect seems ok; the rest of the predictors show nothing much (maybe because of the transformation we lost the relations between them and effort)

pp_check(h2.m1, type = "dens_overlay")

# this looks quite okay-ish

pp_check(h2.m1, type = "error_scatter_avg")

# so there seems to be quite high residual error

h2.m1_R2
    Estimate   Est.Error      Q2.5     Q97.5
R2 0.8921087 0.001381664 0.8892535 0.8946349
# explained variance around 89%

We see two modes in the distribution of the outcome. While the Gaussian family seems to deal with it well, we could also consider a model with a mixture of two Gaussian distributions. This would allow each mode to have its own mean and variance. With effort change taking both positive and negative values, the lognormal distribution is not applicable.
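
A minimal sketch of such a mixture model in brms (the object name h2.m1mix and the loose priors anchoring the two component intercepts near the two modes are assumptions, not part of the analysis above):

# Two-component Gaussian mixture for the bimodal Effort_Change
mix <- mixture(gaussian, gaussian)

# Weakly informative anchors for the two component means, mainly to help
# identifiability (the exact values are hypothetical, roughly one per mode)
priors_mix <- c(
  set_prior("normal(-4, 2)", class = "Intercept", dpar = "mu1"),
  set_prior("normal(4, 2)",  class = "Intercept", dpar = "mu2")
)

h2.m1mix <- brm(
  Effort_Change ~ 1 + PrevAn_z + CommAtt + Familiarity + Big5 + Expressibility_z +
    TrialNumber_c + Modality + (1 | Participant) + (1 | Concept),
  data   = filtered_data,
  family = mix,
  prior  = priors_mix,
  iter   = 4000,
  cores  = 4
)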

Importantly, there are some diagnostics we should be cautious about - the predicted values and the residual errors appear quite strongly correlated.

Also notice the high explained variance indicated by R^2. In a real case, this would almost point to over-fitting. However, we created synthetic data that do not contain as much noise as we can expect from the real data, so if we simply use all the predictors we know to generate variance here, we can end up explaining most of the variation. It is quite likely this will not be the case for the real data.

Splines 1 - fitting b-splines

In the previous models for H1, we were able to spot possible non-linear trends thanks to the contrast coding of the communicative attempts, whereby each level was compared to the preceding one. Now we are dealing with a situation where both the outcome variable and the main predictor are continuous. Here too we might expect nonlinear patterns in their relationship. For instance, it might be the case that for answers that are completely off, people do not feel motivated to put in more effort; some sensible ‘threshold’ needs to be reached for the effort to increase to resolve the misunderstanding, and effort may then decrease again when the answer is very similar.

We will use a combination of B-splines and Bayesian GAMs to prepare some models and try to fit them to the synthetic data. However, note that we currently do not expect any non-linearity, as we have not coded any into the simulation, as seen in the following plot fitted with the loess method.
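
A hedged sketch of that loess plot (the figure itself is not reproduced here):

# Raw relationship between answer similarity and effort change, with a loess smooth
final_data_2 %>%
  drop_na(PrevAn, Effort_Change) %>%
  ggplot(aes(x = PrevAn, y = Effort_Change)) +
  geom_point(alpha = 0.2) +
  geom_smooth(method = "loess")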

B-splines allow us to build up wiggly functions from simpler components called ‘basis functions’ (B). They divide the full range of the predictor into parts and assign a parameter to each part. The parameters are gradually turned on and off in a way that makes their sum into a curve. Unlike polynomial regression, B-splines do not transform the predictor; instead, they invent a series of new synthetic predictor variables. Each of these exists only to gradually turn a specific parameter on and off within a specific range of the real predictor variable. Each of these synthetic variables is called a basis function B. See more in Richard McElreath’s Statistical Rethinking (McElreath 2018).

We are using the code from Solomon Kurz’s adaptation of this book.

# Get rid of NAs in the predictor
d <-
  final_data_2 %>% 
  drop_na(PrevAn)

# And convert all that is necessary to factor/numerical
d$CommAtt <- as.factor(d$CommAtt)
d$Modality <- as.factor(d$Modality)
d$Participant <- as.factor(d$Participant)
d$Concept <- as.factor(d$Concept)
d$TrialNumber <- as.numeric(d$TrialNumber) 

Here we can see summary of each predictor

d %>% 
  select_if(is.numeric) %>%  # Select only numeric columns
  pivot_longer(cols = everything(), names_to = "key", values_to = "value") %>%
  group_by(key) %>%
  summarise(
    mean = mean(value, na.rm = TRUE),
    sd   = sd(value, na.rm = TRUE),
    ll   = quantile(value, probs = 0.055, na.rm = TRUE),
    ul   = quantile(value, probs = 0.945, na.rm = TRUE)
  ) %>%
  mutate(across(where(is.double), round, digits = 2))
# A tibble: 7 × 5
  key             mean    sd    ll    ul
  <chr>          <dbl> <dbl> <dbl> <dbl>
1 Big5            0.98  0.57  0.09  1.93
2 Eff             5.1   2.79  1.43  9.79
3 Effort_Change   0.54  3.8  -5.92  4.99
4 Expressibility  0.47  0.38  0.07  1.32
5 Familiarity     1.13  0.55  0.15  1.9 
6 PrevAn          0.5   0.29  0.05  0.94
7 TrialNumber    22.6  12.4   3    42   
head(d, n=10)
# A tibble: 10 × 11
   Participant Concept Modality  Big5 Familiarity Expressibility CommAtt   Eff
   <fct>       <fct>   <fct>    <dbl>       <dbl>          <dbl> <fct>   <dbl>
 1 1           1       combined  1.92        1.84        0.576   2        6.57
 2 1           1       combined  1.92        1.84        0.576   3        2.37
 3 1           2       gesture   1.92        1.84        0.349   2       10.4 
 4 1           2       gesture   1.92        1.84        0.349   3        4.00
 5 1           3       vocal     1.92        1.84        0.340   2       12.9 
 6 1           3       vocal     1.92        1.84        0.340   3        3.59
 7 1           4       combined  1.92        1.84        0.719   2        6.41
 8 1           6       gesture   1.92        1.84        0.00419 2       12.0 
 9 1           6       gesture   1.92        1.84        0.00419 3        3.44
10 1           8       vocal     1.92        1.84        0.591   2        9.11
# ℹ 3 more variables: TrialNumber <dbl>, PrevAn <dbl>, Effort_Change <dbl>

Now we need to specify knots that function as pivots for a number of different basis functions. The B variable then tells you which knot you are close to.

num_knots <- 7
knot_list <- quantile(d$PrevAn, probs = seq(from = 0, to = 1, length.out = num_knots))
knot_list
          0%    16.66667%    33.33333%          50%    66.66667%    83.33333% 
0.0001020494 0.1540585973 0.3319149296 0.5038721841 0.6719870094 0.8383128390 
        100% 
0.9989464423 

Now we can see how we chopped the data by the knots

Now we need to specify the polynomial degree which determines how parameters interact to produce the spline.

For degree 1, two basis functions combine at each point. For degree 2, three functions combine at each point. For degree 3, four combine.

B <- bs(d$PrevAn,
        knots = knot_list[-c(1, num_knots)], 
        degree = 3, # cubic spline
        intercept = TRUE)

This is how a cubic spline with 7 knots looks
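
A hedged sketch of how the basis functions can be plotted (adapted loosely from the Kurz workflow; the helper objects b_df and b_long are hypothetical):

# Reshape the basis matrix to long format and plot each basis function over PrevAn
b_df <- as.data.frame(B)
b_df$PrevAn <- d$PrevAn

b_long <- pivot_longer(b_df, -PrevAn, names_to = "basis", values_to = "value")

ggplot(b_long, aes(x = PrevAn, y = value, group = basis)) +
  geom_line(alpha = 0.7) +
  labs(y = "basis value")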

We will now add this B matrix into the data to be able to further use it in the models

d2 <-
  d %>% 
  mutate(B = B) 

# take a look at the structure of `d2`
d2 %>% glimpse()
Rows: 2,556
Columns: 12
$ Participant    <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ Concept        <fct> 1, 1, 2, 2, 3, 3, 4, 6, 6, 8, 8, 9, 9, 14, 15, 15, 17, …
$ Modality       <fct> combined, combined, gesture, gesture, vocal, vocal, com…
$ Big5           <dbl> 1.919899, 1.919899, 1.919899, 1.919899, 1.919899, 1.919…
$ Familiarity    <dbl> 1.8439059, 1.8439059, 1.8439059, 1.8439059, 1.8439059, …
$ Expressibility <dbl> 0.576172309, 0.576172309, 0.349378655, 0.349378655, 0.3…
$ CommAtt        <fct> 2, 3, 2, 3, 2, 3, 2, 2, 3, 2, 3, 2, 3, 2, 2, 3, 2, 3, 2…
$ Eff            <dbl> 6.566250, 2.367010, 10.367605, 3.998448, 12.934986, 3.5…
$ TrialNumber    <dbl> 2, 3, 5, 6, 8, 9, 11, 14, 15, 18, 19, 21, 22, 28, 30, 3…
$ PrevAn         <dbl> 0.98450637, 0.82035722, 0.48565959, 0.09090184, 0.19707…
$ Effort_Change  <dbl> 2.222401, -4.199240, 4.869752, -6.369157, 6.781893, -9.…
$ B              <bs[,9]> <bs[26 x 9]>

B is now a matrix column which contains the same number of rows as the other columns, but also has 9 columns nested within it. Each of these corresponds to one synthetic B variable.

We can now use this matrix to fit our model

h2.s1 <- 
  brm(data = d2,
      family = gaussian,
      Effort_Change ~ 1 + B,
      # NB: this Intercept prior is carried over from the Kurz example and sits far
      # from the scale of Effort_Change; the data dominate it here, but it should
      # probably be revised for the real analysis
      prior = c(prior(normal(100, 10), class = Intercept),
                prior(normal(0, 10), class = b),
                prior(exponential(1), class = sigma)),
      iter = 4000, warmup = 2000, chains = 4, cores = 4,
      seed = 4)


# Add criteria for later diagnostics
h2.s1 <- add_criterion(h2.s1, criterion = c("loo", "waic"))

# Calculate also variance explained (R^2)
h2.s1_R2 <- bayes_R2(h2.s1)

# Save both as objects
saveRDS(h2.s1, here("09_Analysis_Modeling", "models", "h2.s1.rds"))
saveRDS(h2.s1_R2, here("09_Analysis_Modeling", "models", "h2.s1_R2.rds"))

beep(5)
h2.s1 <- readRDS(here("09_Analysis_Modeling", "models", "h2.s1.rds"))
h2.s1_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h2.s1_R2.rds"))

# Summary
print(h2.s1)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: Effort_Change ~ 1 + B 
   Data: d2 (Number of observations: 2556) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept     0.60      3.29    -5.87     6.72 1.00     1022     1772
B1            0.54      3.32    -5.66     7.08 1.00     1043     1951
B2            0.15      3.34    -6.13     6.70 1.00     1044     1856
B3            1.83      3.34    -4.47     8.44 1.00     1057     1901
B4           -0.33      3.32    -6.66     6.19 1.00     1042     1844
B5            0.42      3.31    -5.80     6.88 1.00     1041     1814
B6           -0.81      3.31    -7.00     5.71 1.00     1036     1880
B7           -0.57      3.33    -6.88     5.95 1.00     1049     1780
B8           -0.89      3.34    -7.14     5.67 1.00     1067     2043
B9           -1.21      3.32    -7.50     5.35 1.00     1039     1822

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     3.75      0.05     3.65     3.85 1.00     4696     4264

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

It’s a bit difficult to see what is going on so let’s just plot it

Now with the predictor
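
A hedged sketch of how this plot could be produced (the use of fitted() and the helper object f are assumptions, not the original plotting code):

# Posterior mean and 95% interval of the spline fit, plotted over PrevAn
f <- fitted(h2.s1) %>%
  as_tibble() %>%
  bind_cols(d2) %>%
  arrange(PrevAn)

ggplot(f, aes(x = PrevAn)) +
  geom_point(aes(y = Effort_Change), alpha = 0.2) +
  geom_ribbon(aes(ymin = Q2.5, ymax = Q97.5), alpha = 0.3) +
  geom_line(aes(y = Estimate)) +
  labs(y = "Effort_Change")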

So as expected, we see a quite linear decrease, similar to our previous models

Now let’s check some diagnostics

plot(h2.s1)

# looks good

pp_check(h2.s1, type = "dens_overlay")

# because we used only the main predictor, the posterior predictive distribution looks quite bad, but we will work on that

pp_check(h2.s1, type = "error_scatter_avg")

# high residual error for both low and high values

h2.s1_R2
    Estimate   Est.Error       Q2.5      Q97.5
R2 0.0299754 0.006354033 0.01827862 0.04305631
# explained variance ~3%

Splines 2 - Bayesian GAMs

Now we will use a different workflow, using smooth functions with the brms package. brms allows for non-linear models by borrowing functions from the mgcv package (Wood 2017)

This is what the default priors look like for one of these models

get_prior(data = d2,
          family = gaussian,
          Effort_Change ~ 1 + s(PrevAn))
                  prior     class      coef group resp dpar nlpar lb ub
                 (flat)         b                                      
                 (flat)         b sPrevAn_1                            
 student_t(3, 2.1, 2.7) Intercept                                      
   student_t(3, 0, 2.7)       sds                                  0   
   student_t(3, 0, 2.7)       sds s(PrevAn)                        0   
   student_t(3, 0, 2.7)     sigma                                  0   
       source
      default
 (vectorized)
      default
      default
 (vectorized)
      default

At this point, I will just keep priors at default

h2.s2 <-
  brm(data = d2,
      family = gaussian,
      Effort_Change ~ 1 + s(PrevAn, bs = "bs", k = 19),
      iter = 4000, 
      warmup = 2000, 
      chains = 4, 
      cores = 4,
      seed = 4,
      control = list(adapt_delta = .99)
      )

# Add criteria for later diagnostics
h2.s2 <- add_criterion(h2.s2, criterion = c("loo", "waic"))

# Calculate also variance explained (R^2)
h2.s2_R2 <- bayes_R2(h2.s2)

# Save both as objects
saveRDS(h2.s2, here("09_Analysis_Modeling", "models", "h2.s2.rds"))
saveRDS(h2.s2_R2, here("09_Analysis_Modeling", "models", "h2.s2_R2.rds"))


beep(5)

Now we can proceed as we usually do

h2.s2 <- readRDS(here("09_Analysis_Modeling", "models", "h2.s2.rds"))
h2.s2_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h2.s2_R2.rds"))


# Summary
summary(h2.s2)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: Effort_Change ~ 1 + s(PrevAn, bs = "bs", k = 19) 
   Data: d2 (Number of observations: 2556) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Smoothing Spline Hyperparameters:
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sds(sPrevAn_1)     0.04      0.04     0.00     0.15 1.00     2227     4150

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept     0.55      0.07     0.40     0.69 1.00    10619     5777
sPrevAn_1    -0.30      0.05    -0.39    -0.21 1.00     6524     5066

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     3.75      0.05     3.65     3.86 1.00    10144     5898

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
plot(h2.s2)

# looks good

plot(conditional_effects(h2.s2), points = TRUE)

# again, we see similar (non-linear) trend 

pp_check(h2.s2, type = "dens_overlay")

# same as before, ignoring bimodality

pp_check(h2.s2, type = "error_scatter_avg")

# still weird

h2.s2_R2
     Estimate   Est.Error       Q2.5      Q97.5
R2 0.02581913 0.006114862 0.01489172 0.03874999
# explained variance ~2.6%

Splines 3 - GAMs with fixed and random effects

Now let's use the same approach, but add our other predictors.

h2.s3 <- 
  brm(
    data = filtered_data,
    family = gaussian,
    Effort_Change ~ 1 +
      s(PrevAn_z, bs = "bs", k = 19) +  # Smooth for Previous Answer similarity
      CommAtt + Modality + Big5 + Familiarity + Expressibility_z +  # Fixed effects
      (1 | Participant) + (1 | Concept),  # Random effects
    iter = 4000, 
    warmup = 2000, 
    chains = 4, 
    cores = 4,
    seed = 4,
    control = list(adapt_delta = .99)
  )


# Add criteria for later diagnostics
h2.s3 <- add_criterion(h2.s3, criterion = c("loo", "waic"))

# Calculate also variance explained (R^2)
h2.s3_R2 <- bayes_R2(h2.s3)

# Save both as objects
saveRDS(h2.s3, here("09_Analysis_Modeling", "models", "h2.s3.rds"))
saveRDS(h2.s3_R2, here("09_Analysis_Modeling", "models", "h2.s3_R2.rds"))

beep(5)
h2.s3 <- readRDS(here("09_Analysis_Modeling", "models", "h2.s3.rds"))
h2.s3_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h2.s3_R2.rds"))


# Summary
summary(h2.s3)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: Effort_Change ~ 1 + s(PrevAn_z, bs = "bs", k = 19) + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 | Participant) + (1 | Concept) 
   Data: filtered_data (Number of observations: 2556) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Smoothing Spline Hyperparameters:
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sds(sPrevAn_z_1)     0.01      0.01     0.00     0.03 1.00     3075     4614

Multilevel Hyperparameters:
~Concept (Number of levels: 21) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.03      0.02     0.00     0.08 1.00     5219     4819

~Participant (Number of levels: 120) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.02      0.02     0.00     0.07 1.00     6622     4876

Regression Coefficients:
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept           -0.66      0.03    -0.72    -0.61 1.00    15652     5457
CommAtt1             3.73      0.03     3.68     3.78 1.00    20938     5112
Modality1           -0.14      0.07    -0.27    -0.00 1.00    14396     6300
Modality2            0.01      0.07    -0.12     0.15 1.00    15298     7228
Big5                 0.14      0.04     0.05     0.23 1.00    18750     5647
Familiarity          0.15      0.05     0.06     0.24 1.00    22307     5918
Expressibility_z     0.08      0.03     0.02     0.14 1.00    21298     5485
sPrevAn_z_1         -0.32      0.01    -0.35    -0.30 1.00    17636     6556

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     1.25      0.02     1.21     1.28 1.00    22077     5247

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
plot(h2.s3)

# looks ok

plot(conditional_effects(h2.s3), points = TRUE)

# now the effect of PrevAn looks a bit different because we are back to using the z-scored version

pp_check(h2.s3, type = "dens_overlay")

# looks ok, but the posterior predictive distribution undershoots the low values and overshoots the high values, similar to the linear regression

pp_check(h2.s3, type = "error_scatter_avg")

# still not what we would like
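
As an additional, optional check (not part of the confirmatory pipeline), we can group the posterior predictive density by communicative attempt, which is a predictor in h2.s3, to see whether the bimodality in Effort_Change tracks CommAtt.

# Group the posterior predictive density by communicative attempt
pp_check(h2.s3, type = "dens_overlay_grouped", group = "CommAtt", ndraws = 50)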

h2.s3_R2
   Estimate  Est.Error      Q2.5     Q97.5
R2 0.892084 0.00136303 0.8893236 0.8945848
# 89% variance

Diagnostics I

Before we proceed further, let's do a first round of diagnostics.

Rhat

# Extract R-hat values for each model
rhat_list <- lapply(model_list, function(model) {
  rhat_values <- rhat(model)
  data.frame(model = deparse(substitute(model)), 
             max_rhat = max(rhat_values), 
             min_rhat = min(rhat_values))
})

# Combine and inspect
do.call(rbind, rhat_list)
   model max_rhat  min_rhat
1 X[[i]] 1.003546 0.9996569
2 X[[i]] 1.003278 1.0003198
3 X[[i]] 1.001547 1.0000413
4 X[[i]] 1.002112 0.9996489

All Rhat values look good
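
A small aside: deparse(substitute(model)) inside lapply() cannot recover the model names, which is why the model column reads X[[i]] above. A sketch of a fix, assuming the four models above are the ones in model_list (model_list_named and rhat_named are placeholder names):

# Build a named list so the diagnostic tables carry real model labels
model_list_named <- list(h2.m1 = h2.m1, h2.s1 = h2.s1, h2.s2 = h2.s2, h2.s3 = h2.s3)

rhat_named <- lapply(names(model_list_named), function(name) {
  rhat_values <- rhat(model_list_named[[name]])
  data.frame(model = name,
             max_rhat = max(rhat_values),
             min_rhat = min(rhat_values))
})

do.call(rbind, rhat_named)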

ESS

The effective sample size (ESS) tells us how many effectively independent samples the model has drawn from the posterior distribution. A low ESS suggests autocorrelation (i.e., the sampler explores only one part of the posterior at a time), while a high ESS indicates good mixing.

# Extract n_eff values for each model
neff_ratio_list <- lapply(model_list, function(model) {
  neff_values <- neff_ratio(model)              # Here we calculate ratio (not the raw number of effective samples)
  data.frame(model = deparse(substitute(model)), 
             min_neff = min(neff_values), 
             max_neff = max(neff_values),
             mean_neff = mean(neff_values))
               
})

# Combine and inspect
do.call(rbind, neff_ratio_list)
   model  min_neff  max_neff mean_neff
1 X[[i]] 0.3290114 0.9137058 0.8042730
2 X[[i]] 0.1277981 0.5329452 0.2049973
3 X[[i]] 0.2784304 0.7904525 0.6233618
4 X[[i]] 0.3166404 1.0163927 0.8098956

They all look good except the very first non-linear model, h2.s1, which is to be expected since it used only the main predictor. The linear regression model (h2.m1) and the full GAM (h2.s3) have the highest ESS.

effective_sample(h2.m1) 
           Parameter   ESS
1        b_Intercept 13719
2         b_PrevAn_z 17596
3         b_CommAtt1 19690
4      b_Familiarity 16236
5             b_Big5 16183
6 b_Expressibility_z 18846
7    b_TrialNumber_c 11127
8        b_Modality1 13781
9        b_Modality2 11755
effective_sample(h2.s3) 
           Parameter   ESS
1        b_Intercept 15467
2         b_CommAtt1 20347
3        b_Modality1 14550
4        b_Modality2 15207
5             b_Big5 18402
6      b_Familiarity 22096
7 b_Expressibility_z 20936
8     bs_sPrevAn_z_1 17735
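
If we want to flag problematic parameters automatically, a small helper can filter the ESS ratios. This is only a sketch; low_ess is a placeholder name and the 0.1 cut-off is a common heuristic rather than a hard rule.

# Return parameters whose ESS-to-draws ratio falls below a heuristic threshold
low_ess <- function(model, threshold = 0.1) {
  ratios <- neff_ratio(model)
  ratios[ratios < threshold]
}

low_ess(h2.s1)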

LOO & WAIC

l <- loo_compare(h2.m1, h2.s1, h2.s2, h2.s3, criterion = "loo")

print(l, simplify = F)
      elpd_diff se_diff elpd_loo se_elpd_loo p_loo   se_p_loo looic   se_looic
h2.m1     0.0       0.0 -4201.7     48.1        14.8     0.7   8403.4    96.2 
h2.s3    -0.1       0.7 -4201.8     48.1        15.2     0.7   8403.6    96.3 
h2.s2 -2808.1      38.3 -7009.8     26.4         4.8     0.1  14019.6    52.7 
h2.s1 -2810.8      38.3 -7012.5     26.6         9.6     0.3  14025.0    53.3 

Here again we see that the GAM (h2.s3) and the linear regression model (h2.m1) perform best, as assessed by LOO.

w <- loo_compare(h2.m1, h2.s1, h2.s2, h2.s3, criterion = "waic")

print(w, simplify = F)
      elpd_diff se_diff elpd_waic se_elpd_waic p_waic  se_p_waic waic   
h2.m1     0.0       0.0 -4201.7      48.1         14.8     0.7    8403.4
h2.s3    -0.1       0.7 -4201.8      48.1         15.2     0.7    8403.5
h2.s2 -2808.1      38.3 -7009.8      26.4          4.8     0.1   14019.6
h2.s1 -2810.8      38.3 -7012.5      26.6          9.6     0.3   14025.0
      se_waic
h2.m1    96.2
h2.s3    96.3
h2.s2    52.7
h2.s1    53.3
# convert the elpd differences to the WAIC scale (see Solomon Kurz's brms translation of Statistical Rethinking)
cbind(waic_diff = w[,1] * -2,
      se = w[,2] * 2)
         waic_diff        se
h2.m1    0.0000000  0.000000
h2.s3    0.1437139  1.412563
h2.s2 5616.2358401 76.619424
h2.s1 5621.6500755 76.529817

Plot the comparison
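
One way to visualise the LOO comparison is to plot the elpd differences with their standard errors. This is a minimal ggplot2 sketch based on the comparison object l created above; loo_df is a placeholder name.

# Plot elpd differences (relative to the best model) with +/- 1 SE
loo_df <- as.data.frame(l) %>%
  rownames_to_column("model")
loo_df$model <- factor(loo_df$model, levels = rev(rownames(l)))

ggplot(loo_df, aes(x = model, y = elpd_diff)) +
  geom_pointrange(aes(ymin = elpd_diff - se_diff, ymax = elpd_diff + se_diff)) +
  coord_flip() +
  labs(x = NULL, y = "ELPD difference to best model") +
  theme_minimal()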

The WAIC-based model weights tell the same story:

model_weights(h2.m1, h2.s1, h2.s2, h2.s3, weights = "waic") %>% 
  round(digits = 2)
h2.m1 h2.s1 h2.s2 h2.s3 
 0.52  0.00  0.00  0.48 

For the synthetic data, the GAMs probably add unnecessary complexity, since they do not buy us much additional explanatory power. However, following this pipeline, we leave open whether linear or non-linear models will be better suited to the real data.

Adding priors

As with H1, we will also set mildly informative priors. Let's first check which priors were selected by default for h2.s3.

# Print priors
prior_summary(h2.s3)
                  prior     class                           coef       group
                 (flat)         b                                           
                 (flat)         b                           Big5            
                 (flat)         b                       CommAtt1            
                 (flat)         b               Expressibility_z            
                 (flat)         b                    Familiarity            
                 (flat)         b                      Modality1            
                 (flat)         b                      Modality2            
                 (flat)         b                    sPrevAn_z_1            
 student_t(3, 2.1, 2.7) Intercept                                           
   student_t(3, 0, 2.7)        sd                                           
   student_t(3, 0, 2.7)        sd                                    Concept
   student_t(3, 0, 2.7)        sd                      Intercept     Concept
   student_t(3, 0, 2.7)        sd                                Participant
   student_t(3, 0, 2.7)        sd                      Intercept Participant
   student_t(3, 0, 2.7)       sds                                           
   student_t(3, 0, 2.7)       sds s(PrevAn_z, bs = "bs", k = 19)            
   student_t(3, 0, 2.7)     sigma                                           
 resp dpar nlpar lb ub       source
                            default
                       (vectorized)
                       (vectorized)
                       (vectorized)
                       (vectorized)
                       (vectorized)
                       (vectorized)
                       (vectorized)
                            default
                  0         default
                  0    (vectorized)
                  0    (vectorized)
                  0    (vectorized)
                  0    (vectorized)
                  0         default
                  0    (vectorized)
                  0         default

For the coefficients that received flat priors by default, we can re-use the priors from H1. The priors could therefore look somewhat like this:

priors_h2s4 <- c(
  set_prior("normal(0,0.50)", class = "b", coef = "sPrevAn_z_1"),
  set_prior("normal(0,0.50)", class = "b", coef = "CommAtt1"),  # coef names must match those in summary(h2.s3)
  set_prior("normal(0,0.25)", class = "b", coef = "Modality1"),
  set_prior("normal(0,0.25)", class = "b", coef = "Modality2"),
  set_prior("normal(0,0.25)", class = "b", coef = "Big5"),
  set_prior("normal(0,0.25)", class = "b", coef = "Familiarity"),
  set_prior("normal(0,0.25)", class = "b", coef = "Expressibility_z")
)
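
With the priors in place, the prior model could be fitted by passing them to the same formula as h2.s3. This is only a sketch; the name h2.s4 is a placeholder.

# Sketch: prior model (placeholder name h2.s4), same formula as h2.s3 plus priors_h2s4
h2.s4 <-
  brm(
    data = filtered_data,
    family = gaussian,
    Effort_Change ~ 1 +
      s(PrevAn_z, bs = "bs", k = 19) +
      CommAtt + Modality + Big5 + Familiarity + Expressibility_z +
      (1 | Participant) + (1 | Concept),
    prior = priors_h2s4,
    iter = 4000,
    warmup = 2000,
    chains = 4,
    cores = 4,
    seed = 4,
    control = list(adapt_delta = .99)
  )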

Adding Interactions

Because we are already approaching the ceiling of R^2, we are not going to add interactions here. However, we do not exclude the option of adding a few interactions when working with the real data. Possible interactions include:

  • PrevAn x Modality - similarity may affect the change in effort differently across modalities (e.g., vocal corrections might still require more effort)

  • PrevAn x Expressibility - similarity might matter for effort only for highly expressible concepts; concepts with low expressibility are difficult to express and therefore also difficult to exaggerate

  • Familiarity x PrevAn - if the guess is really bad, only very familiar people might be motivated enough to put in more effort

  • Big5 x PrevAn - same reasoning as for Familiarity

Similarly to the workflow adopted for H1, we would fit two new models, one with priors and one with interactions (sketched below), and perform another round of diagnostics to see which model has the most predictive power.
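
For the interaction model, the smooth-by-factor construction from mgcv is one natural way to express the PrevAn x Modality interaction in the GAM. The sketch below is illustrative only, and h2.s5 is a placeholder name.

# Illustrative only: separate PrevAn smooths per modality via the 'by' argument
h2.s5 <-
  brm(
    data = filtered_data,
    family = gaussian,
    Effort_Change ~ 1 +
      s(PrevAn_z, by = Modality, bs = "bs", k = 19) +
      CommAtt + Modality + Big5 + Familiarity + Expressibility_z +
      (1 | Participant) + (1 | Concept),
    iter = 4000, warmup = 2000, chains = 4, cores = 4, seed = 4,
    control = list(adapt_delta = .99)
  )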

Session info

Here is session information for reproducibility:

R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 10 x64 (build 19045)

Matrix products: default


locale:
[1] LC_COLLATE=German_Germany.utf8  LC_CTYPE=German_Germany.utf8   
[3] LC_MONETARY=German_Germany.utf8 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.utf8    

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] emmeans_1.11.0    lubridate_1.9.4   forcats_1.0.0     stringr_1.5.1    
 [5] purrr_1.0.4       readr_2.1.5       tidyverse_2.0.0   bayestestR_0.15.2
 [9] beepr_2.0         brms_2.22.0       Rcpp_1.0.14       bayesplot_1.11.1 
[13] dagitty_0.3-4     ggdag_0.2.13      patchwork_1.3.0   rcartocolor_2.1.1
[17] tibble_3.2.1      ggplot2_3.5.1     tidyr_1.3.1       dplyr_1.1.4      
[21] here_1.0.1       

loaded via a namespace (and not attached):
 [1] gridExtra_2.3        inline_0.3.21        rlang_1.1.5         
 [4] magrittr_2.0.3       matrixStats_1.5.0    compiler_4.4.1      
 [7] mgcv_1.9-1           loo_2.8.0            vctrs_0.6.5         
[10] reshape2_1.4.4       pkgconfig_2.0.3      crayon_1.5.3        
[13] fastmap_1.2.0        backports_1.5.0      labeling_0.4.3      
[16] ggraph_2.2.1         utf8_1.2.4           rmarkdown_2.29      
[19] tzdb_0.5.0           bit_4.6.0            xfun_0.52           
[22] cachem_1.1.0         jsonlite_2.0.0       tweenr_2.0.3        
[25] parallel_4.4.1       R6_2.6.1             stringi_1.8.7       
[28] StanHeaders_2.32.10  boot_1.3-30          estimability_1.5.1  
[31] rstan_2.32.7         knitr_1.50           audio_0.1-11        
[34] Matrix_1.7-0         igraph_2.1.4         timechange_0.3.0    
[37] tidyselect_1.2.1     abind_1.4-8          yaml_2.3.10         
[40] viridis_0.6.5        codetools_0.2-20     curl_6.2.2          
[43] pkgbuild_1.4.7       lattice_0.22-6       plyr_1.8.9          
[46] withr_3.0.2          bridgesampling_1.1-2 posterior_1.6.1     
[49] coda_0.19-4.1        evaluate_1.0.3       RcppParallel_5.1.10 
[52] polyclip_1.10-7      pillar_1.10.2        tensorA_0.36.2.1    
[55] checkmate_2.3.2      stats4_4.4.1         insight_1.1.0       
[58] distributional_0.5.0 generics_0.1.3       vroom_1.6.5         
[61] rprojroot_2.0.4      hms_1.1.3            rstantools_2.4.0    
[64] munsell_0.5.1        scales_1.3.0         xtable_1.8-4        
[67] glue_1.8.0           tools_4.4.1          mvtnorm_1.3-3       
[70] graphlayouts_1.2.2   tidygraph_1.3.1      grid_4.4.1          
[73] QuickJSR_1.7.0       colorspace_2.1-1     nlme_3.1-164        
[76] ggforce_0.4.2        cli_3.6.4            viridisLite_0.4.2   
[79] Brobdingnag_1.2-9    V8_6.0.3             gtable_0.3.6        
[82] digest_0.6.37        ggrepel_0.9.6        htmlwidgets_1.6.4   
[85] farver_2.1.2         memoise_2.0.1        htmltools_0.5.8.1   
[88] lifecycle_1.0.4      bit64_4.6.0-1        MASS_7.3-60.2       

References

Bürkner, Paul-Christian. 2017. “brms: An R Package for Bayesian Multilevel Models Using Stan.” Journal of Statistical Software 80 (1): 1–28. https://doi.org/10.18637/jss.v080.i01.
McElreath, Richard. 2018. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. New York: Chapman and Hall/CRC. https://doi.org/10.1201/9781315372495.
Morris, Tim P., Ian R. White, and Michael J. Crowther. 2019. “Using Simulation Studies to Evaluate Statistical Methods.” Statistics in Medicine 38 (11): 2074–2102. https://doi.org/10.1002/sim.8086.
Wood, S. N. 2017. Generalized Additive Models: An Introduction with R. 2nd ed. Chapman and Hall/CRC.