```
# Sum Coding (vs. mean)
options(contrasts = c("contr.sum", "contr.poly"))
```

# Mixed-Efects Logistic Regression Analysis: Part 2

*Goldvarb*analysis. Part 2: Sum Contrast Coding

# Sum Contrasts (vs. mean)

Before you proceed with this section, please make sure that you have your data loaded and modified based on the code here and that `Dep.Var`

is re-coded such that `Deletion`

is the second factor. Next, you set the global *R* options to employ sum contrast coding.

Now you are ready to create a mixed-effects logistic regression model that is comparable to the model produced by *Goldvarb*.

## Building Your Model

The next step is creating the mixed-effects model. The following code tests the fixed effects of preceding phonological context (`Before`

), following phonological context (`After.New`

), morphological status (`Morph.Type`

), lexical stress of the syllable (`Stress`

), underlying phoneme (`Phoneme`

), speaker age (`Centre.Age`

), speaker sex (`Sex`

) and speaker education level (`Education`

[^2]), on the deletion of (t ,d) in the data set. It also takes into account the potential random effect of speaker (`Speaker`

[^3]). The function for creating this model, `glmer()`

(for Generalized Linear Mixed Effects model with Random effects, what I call the “glimmer” [glɪmɚ] function) is part of the `lme4`

package.

Based on the random forest analysis performed in Random Forests: The Basics, you know that `After`

does a better job of explaining the variation than `After.New`

; however, you want to make your analysis comparable to analyses in the sociolinguistic literature that do not single out pre-/h/ contexts, so you include `After`

in the analysis. See also *Re-coding Variables* in Modifying Your Data.

Additionally, the random forest analysis indicated that `Job`

does a better job than Education; however, you may be specifically interested in education level, so you may choose this variable instead.

It is also possible to include interaction groups in the model. For example, you could include the interaction group (`Age_Sex`

), or you could tell *R* to make an *ad hoc* interaction group by specifying `Age*Sex`

as a predictor in the model. I won’t discuss interactions here, but you can learn all about them from the very well-written *Notes on Interactions* by Derek Denis, available here. They are also discussed in Part 3. The interpretation of interaction groups for *Rbrul* and in a sum contrast `glmer()`

models is identical.

Here is the code for generating the `glmer()`

analysis.

```
# Generalized linear mixed effects model with the
# fixed main effects of Before, After.New,
# Morph.Type, Stress, Phoneme, Centre.Age, Sex
# and Education, and the random effect of Speaker
library(lme4)
<- glmer(Dep.Var ~ Before + After.New + Morph.Type +
td.glmer + Phoneme + Center.Age + Sex + Education +
Stress 1 | Speaker), data = td, family = "binomial",
(control = glmerControl(optCtrl = list(maxfun = 20000),
optimizer = "bobyqa"))
```

As with the `ctree()`

function, you construct your `gmler()`

model by first specifying the dependent variable, here `Dep.Var`

, then using `~`

to indicate that everything to the right is a potential predictor of your dependent variable (e.g., the variable on the left varies as a function of the variables on the right). The predictors are separated by a `+`

. You specify that `Speaker`

is a random effect by enclosing it in `(1| )`

. Here the `1`

simply indicates the model’s intercept. You are essentially telling *R* to assume a different intercept (i.e., baseline likelihood of `Deletion`

) for each level of `Speaker`

. This effectively resolves the non-independence that stems from having multiple tokens by the same speaker. If you wanted to include both speaker and word as random effects, assuming you had columns called `Speaker`

and `Word`

, you could specify `+ (1|Speaker) + (1|Word)`

in your function. If you do not want any random effects in your model, you cannot use `glmer()`

. Instead, you must use `glme()`

.

After specifying your predictors, you indicate that `family = "binomial"`

because you are looking at the binary choice between `Deletion`

and `Realization`

. The specification `control = glmerControl(optCtrl = list(maxfun = 2e4), optimizer = "bobyqa")`

simply tweaks how many function evaluations the `glmer()`

optimizer will try before giving up and declaring non-convergence with an error message. You don’t need to use these specifications. If you don’t, you may get non-convergence warnings — but even if you do, that isn’t necessarily the end of the world. As long as the reason you’re getting the the non-convergence warnings is NOT because of singletons or knockouts in some cells (as a good sociolinguist I know you’ve weeded all of these out based on your cross-tabs), a model with a non-convergence warning like `Model failed to converge with max|grad| = 0.0259806 (tol = 0.001, component 1)`

will still yield explanatory, albeit sub-optimal, test statistic values.

There are several things that will cause the model not to converge (i.e., fail). The first (and most common cause) is that your model is too complex. Complexity arises from having too many potential predictors or too many levels within each predictor. This complexity is more pernicious if your data set is small. Tweaking the `glmer()`

controls can help, but it won’t always overcome extreme complexity. The first step, then, when dealing with non-convergence is thinking (from a theoretical perceptive) how you can simplify your model. Using a Conditional Inference Tree or Random Forest analyses can help — so can a really thorough exploration of you data using cross tabs. Cross-tabs especially can help you find whether you have **singletons** or **knockouts**. These terms are hold-overs from *Goldvarb* for phenomena in your data that can cause non-convergence, but they can also cause non-convergence in a `glmer()`

model.

The following will cause non-convergence or skewed results in your regression analysis. :

**singleton**— a single-level predictor variable and/or its one level. In the partition`td.young`

the predictor`Age.Group`

is a singleton because the only value is`Young`

. Solution: don’t include this predictor in your model.**knockout**— when a level of a predictor variable always (100% of tokens) or never (0% tokens) occurs with the application value of the dependent variable. Solution: don’t include this level in your model (but account for it in your description of the data), or re-code in a thoeortetically-motivated way.

In the code above you used the `<-`

function to assign your model to the object `td.glmer`

. To see the results of the model, use the `summary()`

function on the model object.

`summary(td.glmer)`

```
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: Dep.Var ~ Before + After.New + Morph.Type + Stress + Phoneme +
Center.Age + Sex + Education + (1 | Speaker)
Data: td
Control: glmerControl(optCtrl = list(maxfun = 20000), optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
1111.1 1192.4 -539.6 1079.1 1173
Scaled residuals:
Min 1Q Median 3Q Max
-5.0817 -0.4936 -0.2554 0.4880 15.0593
Random effects:
Groups Name Variance Std.Dev.
Speaker (Intercept) 0.6459 0.8036
Number of obs: 1189, groups: Speaker, 66
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.255788 0.202133 -1.265 0.20571
Before1 -0.563649 0.202605 -2.782 0.00540 **
Before2 0.542851 0.193737 2.802 0.00508 **
Before3 0.102101 0.278658 0.366 0.71407
Before4 0.720732 0.190146 3.790 0.00015 ***
After.New1 1.839172 0.157358 11.688 < 2e-16 ***
After.New2 -1.168199 0.144397 -8.090 5.96e-16 ***
Morph.Type1 0.423432 0.140168 3.021 0.00252 **
Morph.Type2 -1.882511 0.213596 -8.813 < 2e-16 ***
Stress1 -0.792893 0.137440 -5.769 7.97e-09 ***
Phoneme1 0.280468 0.127699 2.196 0.02807 *
Center.Age 0.005787 0.008441 0.686 0.49296
Sex1 -0.122564 0.150397 -0.815 0.41511
Education1 -0.178905 0.181832 -0.984 0.32517
Education2 0.647319 0.275276 2.352 0.01870 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```
Correlation matrix not shown by default, as p = 15 > 12.
Use print(x, correlation=TRUE) or
vcov(x) if you need it
```

## Interpreting Your Model, Getting Constraint Hierarchy

Now that you have the model, what does it tell you? There are all sorts of details in the `summary(td.glmer)`

output, but we’re first just going to focus on the the first few lines.

The beginning of the output simply tells you that you’ve completed a `Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]`

. This is just name of the function you’ve just executed.

We are. (See here.) The basic idea behind **Generalized Linear Models** (not to be confused with General Linear Models) is to specify a **link function** that transforms the response space into a modelling space where we can perform a linear regression, and to capture the dependence of the variance on the mean through a **variance function**. A **Logistic regression**, then, is simply a linear regression analysis of binary data that has been first converted to the logit scale (thus making it “logistic”) and for which the variance function is the variance of the **binomial** distribution.

The key to understanding why we do this is that linear regression predicts the relationship between continuous, unbounded variables. This means that if we model the likelihood of a binary variable (e.g., \(0\) vs. \(1\)) using linear regression, the model will predict scenarios where the variable could be lower than \(0\) or higher than \(1\). This motivates the conversion of the binary variable onto the logit scale.

Usually we express the probability of the application value occurring as a proportion (number of tokens of the application value/total number of tokens). This proportion is bounded by \(0\) and \(1\). We can also talk about the odds of the application value occurring, which is the ratio of application vales to non-application values. Odds ratios, like proportions, are also bounded on one end, ranging from \(1\) to \(+\infty\). Odds ratios, however, can be converted to the logit scale (making them log odds), which allows us to consider this likelihood of the application value on a continuous scale (log odds range from \(-\infty\) to \(+\infty\)).

Probability, odds ratios, and log odds are all the same thing, just expressed in different ways. It’s similar to the idea of scientific notation: the number \(1,000\) can be written as \(1.0\times 10^3\) or even \(10\times 10\times 10\).

**Probability** is the probability that an event happens, i.e., that a token is the application value. For example, there are \(1189\) tokens, of which \(386\) are `Deletion`

. The proportion of deletion is \(386/1189\) or approximately \(0.32\). This means any given token has a \(32\%\) chance of being a `Deletion`

token.

**Odds** (more technically the odds of success) is defined as probability of success divided by the probability of failure. So the odds of a token being the application value (\(32\%\) chance of `Deletion`

) has an accompanying odds of failure (\(68\%\) chance of `Realization`

). Odds can be expressed as the ratio between these two, or as an **Odds Ratio**: \(0.32/0.68\) or approximately \(0.47\)

**Log odds** is the (natural) logarithm of the odds: \(log_e(0.47) = -0.75\). A logarithm is just another way to express an exponent: \(log_e(0.47) = -0.75\) is identical to \(e^{-0.75} = 0.47\), where \(e\) is Euler’s number, which is a mathematical constant used for this purpose (the first few numbers of which are \(2.718\)). Converting probabilities or odds ratios to log odds results in symmetry around zero, as shown in the following table:

Probability | Odds Ratio | Log Odds |
---|---|---|

\(0.10\) or \(10\%\) | \(0.111\) | \(-2.197\) |

\(0.20\) or \(20\%\) | \(0.250\) | \(-1.386\) |

\(0.30\) or \(30\%\) | \(0.428\) | \(-0.847\) |

\(0.40\) or \(40\%\) | \(0.667\) | \(-0.405\) |

\(0.50\) or \(50\%\) | \(1.000\) | \(0\) |

\(0.60\) or \(60\%\) | \(1.500\) | \(+0.406\) |

\(0.70\) or \(70\%\) | \(2.333\) | \(+0.847\) |

\(0.80\) or \(80\%\) | \(4.000\) | \(+1.386\) |

\(0.90\) or \(90\%\) | \(9.000\) | \(+2.197\) |

The next lines of the `summary(td.glmer)`

output is tells you the variance function `Family: binomial`

and the link function `(logit)`

and the formula used to construct the model `Formula: Dep.Var ~ Before + After.New + Morph.Type + Stress + Phoneme + Center.Age + Sex + Education + (1 | Speaker)`

. Next is the data `Data: td`

and the tweak you’ve made to the controls: `Control: glmerControl(optCtrl = list(maxfun = 20000), optimizer = "bobyqa")`

. This information is not new to you because it’s exactly what you specified.

You are then given some measures of model fit, including `AIC`

, `BIC`

, `logLik`

(log likelihood), and `deviance`

.^{1} These values measure how well your model predicts the actual values of your data. They are measures of prediction error. This is similar to the log-likelihood reported by *Goldvarb*. Higher values for these measures indicate a worse fit to the data, lower values indicate a better fit to the data. Following these measures you are given the degrees of freedom of the residuals `df.resid`

^{2} and then descriptors of the scaled residuals (`Min`

, `Max`

, and `Mean`

values and 1st and 3rd quartiles, `1Q`

and `3Q`

). The scaled residuals are simply a description of the variation that is not predicted by the model, or rather, the difference between the predicted and observed results. In large data sets these residuals should be normally distributed. These measures/residuals are more important for statisticians aiming to craft a model with the best possible fit to the data. They are also somewhat fuzzy to interpret for logistic regression modelling. For your purposes, where the goal is instead to test hypotheses or confirm trends, the goodness of fit of your model or the extent to which is explains all the data is only relevant insofar as it allows you to select the model built with the independent predictors (which you’ve selected to include in your analysis based of good theoretical linguistic/social reasoning) that best explain the variation. In other words, for you, a good model is not one that best fits the data, but rather that is the most sociolinguistically explanatory — that tells the story of the variation in the best possible way.

Including all the independent predictors you want to test is called creating a **full model** or **maximal model**. Once you start removing un-informative independent predictors from your model, or pruning it, you are entering the territory of model selection, which is as much an art as it is a science. Some statisticians recommend reporting on the full/maximal model, others (like Bates, Kleigl, Vasishth,and Baayen 2018) argue for reporting the most **parsimonious** or the least complex maximally predictive model. Depending on your goals, you may choose to report one or the other. For example, the maximal model may be useful when comparing the same regression analysis across multiple partitions/data sets.

Comparing measures of model fit can be useful when you have two potential predictors that are non-orthogonal (not independent) like education and employment type. You would not include both education and employment type in the same model because in many communities these two factors are not independent of each other. In Cape Breton, for example, white collar workers have higher education levels than blue collar workers. Including only one in a model is usually fine given that both are proxies for social status anyway. But which one do you choose to include?

One way to choose is to construct two identical models, one with `Education`

, one with `Job`

, and then compare how well each fits the data. If, for example, the model with `Education`

fits the data better, you could argue that education level does a better job of explaining the variation than employment type. You could use this same strategy if you wanted to compare models with different coding schemes for certain parameters (like `After`

and `After.New`

).

Comparing goodness of fit is not as easy as just comparing `AIC`

or `BIC`

, etc. though. Often values of goodness of fit measures that are very similar across models may in fact not be significantly different from one another given the differing number of parameter levels in each model. For example, the `AIC`

of the most parsimonious model above constructed with `After`

instead of `After.New`

is \(1049.9\) (\(13\) parameters). The `AIC`

of the model constructed with `After.New`

(which you’ll remember groups pre-/h/ contexts with other pre-consonantal contexts in order to compare with past research, see Modifying Data) is \(1113.8\) (\(12\) parameters). This lower `AIC`

with `After`

indicates that this model is a better fit to the data than the model constructed with `After.New`

. This is unsurprising given that /h/ disfavours `Deletion`

, but other consonants do not (see the Conditional Inference Tree analysis). The difference between the `AIC`

of the two models (given the difference of 1 parameter between them, i.e., degrees of freedom/df = 1) is statistically significantly greater than zero ( `Pr(>Chisq) = 4.645e-16`

or \(4.645\times 10^{-16}\), i.e., \(p<0.05\)). This can be determined using the function `anova(td.glmer1, td.glmer2)`

where `td.glmer1`

and `td.glmer2`

are the same model, but with one using `After`

and the other using `After.New`

. Note that the relevant function is `anova()`

, which is used for comparing models, and not `Anova()`

, which is used for evaluating the significance of fixed effects in a model.

```
<- glmer(Dep.Var ~ After + Morph.Type + Before +
td.glmer1 + Phoneme + (1 | Speaker), data = td, family = "binomial",
Stress control = glmerControl(optCtrl = list(maxfun = 20000),
optimizer = "bobyqa"))
<- glmer(Dep.Var ~ After.New + Morph.Type +
td.glmer2 + Stress + Phoneme + (1 | Speaker), data = td,
Before family = "binomial", control = glmerControl(optCtrl = list(maxfun = 20000),
optimizer = "bobyqa"))
anova(td.glmer1, td.glmer2)
```

```
Data: td
Models:
td.glmer2: Dep.Var ~ After.New + Morph.Type + Before + Stress + Phoneme + (1 | Speaker)
td.glmer1: Dep.Var ~ After + Morph.Type + Before + Stress + Phoneme + (1 | Speaker)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
td.glmer2 12 1113.8 1174.8 -544.92 1089.8
td.glmer1 13 1049.9 1115.9 -511.95 1023.9 65.942 1 4.645e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

You can visualize the model fit using the `binnedplot()`

function from the `arm`

package.

```
library(arm)
<- predict(td.glmer1)
x <- resid(td.glmer1)
y binnedplot(x, y)
```

In logistic regression, as with linear regression, the residuals are just the difference between the actual values and the values predicted by the model. Since the dependent variable is binary, the residuals will be binary too (either \(1\) or \(0\)), so plotting the raw residuals is not really that informative. The binned residuals plot above divides the data into categories (bins) based on their fitted (predicted) values and then plots the average residual versus the average fitted value for each bin. In the plot the grey lines indicate plus and minus \(2\) standard-error bounds. We expect about \(95\%\) of the binned residuals (black dots) to fall between the two grey lines if the model is actually true. By default, for data sets larger than \(100\) tokens, the number of bins is the square root of the total number of tokens. You can play with the number of bins with the option `nclass=`

.

Compare the two binned residual plots (above and below). You can see that for the `td.glmer2`

residual plot there are more black dots outside the grey lines, indicating an inferior fit.

```
library(arm)
<- predict(td.glmer2)
x <- resid(td.glmer2)
y binnedplot(x, y)
```

We can do the same thing, but instead testing the difference between models built using a discrete age predictor: `Age.Group`

, versus a continuous age predictor: `Center.Age`

.

```
<- glmer(Dep.Var ~ After + Morph.Type + Before +
td.glmer3 + Phoneme + Center.Age + (1 | Speaker),
Stress data = td, family = "binomial", control = glmerControl(optCtrl = list(maxfun = 20000),
optimizer = "bobyqa"))
<- glmer(Dep.Var ~ After + Morph.Type + Before +
td.glmer4 + Phoneme + Age.Group + (1 | Speaker), data = td,
Stress family = "binomial", control = glmerControl(optCtrl = list(maxfun = 20000),
optimizer = "bobyqa"))
anova(td.glmer3, td.glmer4)
```

```
Data: td
Models:
td.glmer3: Dep.Var ~ After + Morph.Type + Before + Stress + Phoneme + Center.Age + (1 | Speaker)
td.glmer4: Dep.Var ~ After + Morph.Type + Before + Stress + Phoneme + Age.Group + (1 | Speaker)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
td.glmer3 14 1051.0 1122.1 -511.48 1023.0
td.glmer4 15 1052.9 1129.1 -511.44 1022.9 0.0918 1 0.7619
```

The results of this `anova()`

show that the difference in fit of a model built with `Center.Age`

(`AIC`

\(= 1051.0\)) and `Age.Group`

(`AIC`

\(= 1052.9\)) is not significant (`Pr(>Chisq) = 0.7619`

, or \(p>0.05\)), or rather, the choice between the two is inconsequential to modelling the variation in the data.

In may also be useful to report in your manuscript that a model built with your fixed effects does a better job at predicting the variation than a model built with just the random effects (i.e., a **null model**). To make this comparison you build a model with no fixed effects and compare that using the `anova()`

function to your model with fixed effects.

```
<- glmer(Dep.Var ~ (1 | Speaker), data = td,
td.glmer.null family = "binomial", control = glmerControl(optCtrl = list(maxfun = 20000),
optimizer = "bobyqa"))
anova(td.glmer1, td.glmer.null)
```

```
Data: td
Models:
td.glmer.null: Dep.Var ~ (1 | Speaker)
td.glmer1: Dep.Var ~ After + Morph.Type + Before + Stress + Phoneme + (1 | Speaker)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
td.glmer.null 2 1455.8 1465.9 -725.88 1451.8
td.glmer1 13 1049.9 1115.9 -511.95 1023.9 427.86 11 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

In a manuscript you would report that the model built with fixed effect predictors and the random effect of `Speaker`

(`AIC`

\(=1049.9\)) does a significantly better job at predicting the variation in the data than a null model built with just the random effect of `Speaker`

(`AIC`

\(=1455.8\); \(\chi^2=437.86\), \(\text{df}=11\), \(p<0.001\)).

An additional measure of the success of your model is the \(R^2\) value. This value tells you the proportion of the variability of the dependent variable that is explained by the independent predictors collectively. \(R^2\) squared is a useful metric for multiple linear regression and as such is often requested by reviewers. But \(R^2\) does not have the same meaning for logistic regression (binary dependant variables) as it does for linear regression (continuous dependant variables). Statisticians have come up with a variety of analogues of \(R^2\) for multiple logistic regression referred to collectively as “pseudo \(R^2\)”. Given that there are multiple methods of calculating \(R^2\), and that its use for non-linear models is still debated by statisticians, use and report it with a grain of salt.

The easiest way to calculate a (pseudo-)\(R^2\) value using the Nakagawa & Schielzeth’s (2012) method is to use the function `r.squaredGLMM()`

from the `MuMIn`

package.

`install.packages("MuMIn")`

```
library(MuMIn)
r.squaredGLMM(td.glmer)
```

```
R2m R2c
theoretical 0.4293394 0.5229847
delta 0.3626576 0.4417586
```

The `r.squaredGLMM()`

function returns a matrix with two calculations each for `R2m`

and `R2c`

. The first, `R2m`

or the marginal \(R^2\) value, represents the variance explained by the fixed effects alone. The function calculates this using two different methods. You can just look at the `theoretical`

calculation. It tells you that `0.43`

or \(43\%\) of the variance is explained by the fixed effects. The second set of values, the `R2c`

or the conditional \(R^2\) value, represents the variance that is explained by the fixed effects plus the random effects. Here `0.52`

or \(53\%\) of the variance is explained by the combination of fixed and random effects.

You cannot meaningfully compare model fit across different data sets. Identical tokens and an identical dependant variable must be included in the two models being compared. This is equally true for comparing AIC and \(R^2\).

### Random Effects

Lets look at the results of `summary(td.glmer)`

again.

`summary(td.glmer)`

```
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: Dep.Var ~ Before + After.New + Morph.Type + Stress + Phoneme +
Center.Age + Sex + Education + (1 | Speaker)
Data: td
Control: glmerControl(optCtrl = list(maxfun = 20000), optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
1111.1 1192.4 -539.6 1079.1 1173
Scaled residuals:
Min 1Q Median 3Q Max
-5.0817 -0.4936 -0.2554 0.4880 15.0593
Random effects:
Groups Name Variance Std.Dev.
Speaker (Intercept) 0.6459 0.8036
Number of obs: 1189, groups: Speaker, 66
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.255788 0.202133 -1.265 0.20571
Before1 -0.563649 0.202605 -2.782 0.00540 **
Before2 0.542851 0.193737 2.802 0.00508 **
Before3 0.102101 0.278658 0.366 0.71407
Before4 0.720732 0.190146 3.790 0.00015 ***
After.New1 1.839172 0.157358 11.688 < 2e-16 ***
After.New2 -1.168199 0.144397 -8.090 5.96e-16 ***
Morph.Type1 0.423432 0.140168 3.021 0.00252 **
Morph.Type2 -1.882511 0.213596 -8.813 < 2e-16 ***
Stress1 -0.792893 0.137440 -5.769 7.97e-09 ***
Phoneme1 0.280468 0.127699 2.196 0.02807 *
Center.Age 0.005787 0.008441 0.686 0.49296
Sex1 -0.122564 0.150397 -0.815 0.41511
Education1 -0.178905 0.181832 -0.984 0.32517
Education2 0.647319 0.275276 2.352 0.01870 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```
Correlation matrix not shown by default, as p = 15 > 12.
Use print(x, correlation=TRUE) or
vcov(x) if you need it
```

After the measures of model fit is information about the random effects. In `td.glmer`

there is only one random effect: `Speaker`

. It is listed under `Groups`

because the model groups data by `Speaker`

. The `(Intercept)`

is listed under `Name`

because the model allows for variation of the `(Intercept)`

(i.e., baseline likelihood) by level of `Speaker`

. The likelihood of `Deletion`

for all levels of `Speaker`

considered together is found below under `Fixed Effect`

. It is the `Estimate`

value of `(Intercept)`

, e.g., -0.2557885 log odds. The `Variance`

and the `Std.Dev`

are two different ways of expressing how much the levels of `Speaker`

vary around this baseline value. The `Std.Dev`

is simply the square root of the `Variance`

(\(\sqrt{0.6459} = 0.8036\). There is no consensus among sociolinguistics as to whether to report the value for `Variance`

or `Std.Dev`

. I prefer `Std.Dev`

because it is the same units as the `(Intercept)`

. In a manuscript you can therefore report that the overall baseline probability of the `td.glmer`

model is \(-0.256\) log odds (\(\pm 0.806\) log odds, by speaker). I usually round my log odds to three places after the decimal; more precision is not needed in manuscripts.

Since we assume these likelihoods are normally-distributed, \(95\%\) of the speakers’ likelihoods will be within two standard deviations around the overall likelihood. We can calculate this using simple addition and subtraction, or we can calculate the range using an idealized normal distribution (using `qqnorm()`

). The results of these two calculations are slightly different as they are derived using somewhat different mathematical operations. For your purposes, just choose one method and stick with it. To make your calculations easier you can assign the overall likelihood and random effects standard deviation to their own variables.

```
# Calculating the 95% range for a normal
# distribution on the logit scale
# Assign overall likelihood and random effect
# standard deviations to their own variables
<- -0.255788
td.intercept <- 0.8036
td.rsd
# or
<- fixef(td.glmer)[1]
td.intercept <- sqrt(unlist(VarCorr(td.glmer)))
td.rsd
# Calculate +/- 2 standard deviations using a
# mathematical formula, lower then higher
- 2 * td.rsd td.intercept
```

```
(Intercept)
-1.863085
```

`+ 2 * td.rsd td.intercept `

```
(Intercept)
1.351508
```

```
# Calculate the 95% range (2.5% to 97.5%) using
# an idealized normal distribution on the logit
# scale
qnorm(c(0.025, 0.975), mean = td.intercept, sd = td.rsd)
```

`[1] -1.830910 1.319333`

The results of the calculations are reported in log odds. It may be more interpretable to report these values as probabilities.

*Goldvarb* reports **factor weights**, which are expressed as probabilities; the `glmer()`

function reports **log odds**.

To convert probabilities to log odds use the logit formula \(x=log(\frac{p}{1-p})\), where \(p\) is the probability and \(x\) is the log odds value. It is much easier, however, to just use the `logit()`

function.

```
library(car)
# Convert probabilities to log odds
logit(0.4)
```

`[1] -0.405`

To convert log odds to probabilities you can use the inverse logit formula \(p=\frac{e^x}{(1+e^x)}\), or the `inv.logit()`

function from the `boot`

package. (If you’ve still got the `car`

package loaded from earlier you may need to reload the `boot`

package.)

```
# Convert log odds to probabilities
library(boot)
inv.logit(-0.405)
```

`[1] 0.4`

```
# (Intercept) converted to probability
inv.logit(td.intercept)
```

```
(Intercept)
0.436
```

```
# 95% range converted to probabilities
inv.logit(qnorm(c(0.025, 0.975), mean = td.intercept,
sd = td.rsd))
```

`[1] 0.138 0.789`

Based on the above calculations, you can report in a manuscript that the mean baseline probability of `Deletion`

in the data is \(44\%\) and that the \(95\%\) range for individual speakers’ baseline probabilities is \(14\%\) to \(79\%\).

To get the baseline likelihood for individual speakers you can extract the random effect values using `ranef()`

.

```
# Get individual baseline likelihoods by speaker
ranef(td.glmer)
```

```
$Speaker
(Intercept)
ARSM91 -0.50258
BEAM91 -0.34013
BOUF65 -0.67444
CARM91 -0.59391
CHIF55 -0.11282
CLAF52 0.20791
CLAM73 0.10943
CONM89 0.25294
DAVM90 0.34813
DELF91 -0.47659
DONF15 0.13907
DONM41 0.04716
DONM53 -0.18729
DONM58 -0.63125
DOUF46 0.56661
ELLF29 -0.15042
ELLF61 -0.58827
EVAF92 -0.22506
FRAM93 -0.67112
GARF16 -0.19906
GARF37 -1.00238
GARF87 -0.18074
GARM42 -0.68499
GARM85 -0.58814
GAVF93 0.72170
GAVM90 -0.15733
GOUM91 -0.08336
GREF22 0.78227
GREM45 -0.37970
HANF83 -0.33334
HANM57 0.86675
HAWM90 1.12063
HOLF49 0.77544
HOLM52 0.08846
HUNF22 -0.46537
INGM84 1.12780
INGM87 -0.26438
JOCF91 -0.47713
JOYF91 -0.64378
KAYF29 0.06456
KAYM29 0.52023
LATF53 -1.14944
LELM91 -0.82195
LEOF66 -0.67818
MARM92 1.42939
MOFM55 -0.10666
MORF91 0.00951
NATF84 1.17572
NEIF49 0.21234
PACM94 0.08947
PEIF57 0.16229
PHAM91 -0.05544
ROBM64 0.27022
ROLF91 0.44736
RUDF73 0.25617
SAMF61 0.82955
SILM90 -0.76980
SMIF58 -0.63704
SMIM61 0.58311
STAM21 0.69893
STEF99 -0.56206
STEM42 0.08117
STEM65 -0.35627
TAMF91 0.69922
VICF91 1.54293
VIKF91 0.56214
with conditional variances for "Speaker"
```

For each individual speaker you add their random effect value to the overall baseline likelihood to get that speaker’s baseline likelihood. Then you convert the log odds to probability (here, arbitrarily using the `plogis()`

function, another option for converting log odds to probabilities). As always, you can nest these functions together.

```
# Get random effect for ARSM91
ranef(td.glmer)$Speaker["ARSM91", ]
```

`[1] -0.503`

```
# Calculate the sum of the random effect for
# ARSM91 and (Intercept)
sum(ranef(td.glmer)$Speaker["ARSM91", ], fixef(td.glmer)["(Intercept)"])
```

`[1] -0.758`

```
# Convert the result of the above function from
# log odds to probability using plogis()
plogis(sum(ranef(td.glmer)$Speaker["ARSM91", ], fixef(td.glmer)["(Intercept)"]))
```

`[1] 0.319`

The random effect for `ARSM91`

is \(-0.503\) from the overall baseline likelihood (e.g., the `(Intercept)`

estimate of \(-0.255788\) log odds). The combination of these is \(-0.758\) log odds. We can therefore report that the baseline probability of `Deletion`

for speaker `ARSM91`

is \(0.319\) or \(32\%\).

Below is a series of functions that extracts the coefficient (in log-odds) of the random intercept for each speaker and then adds next to those coefficients the frequency of the application value for each speaker, as well as that speaker’s total number of tokens. Finally it orders the speakers from lowest to highest random effect intercept coefficient. There is also an extra step to specify the order of the `Dep.Var`

factor because the following `table()`

function specifies the level to extract by number and you want to make sure that is `Deletion`

. The code is a little bit complex, but if you’ve been following along with this guide up until this point, you should be able to follow along with this code, step-by-step, too.

```
# Create column of Speakers with intercept
# coefficient
library(dplyr)
<- rownames_to_column(as.data.frame(ranef(td.glmer)$Speaker),
td.ranef "Speaker")
colnames(td.ranef)[2] <- "Intercept"
# Reorder levels of Dep.Var to make application
# value second
$Dep.Var <- factor(td$Dep.Var, levels = c("Realized",
td"Deletion"))
# Create column of Frequencies
<- rownames_to_column(as.data.frame(prop.table(table(td$Speaker,
speaker.prop $Dep.Var), 1)[, 2]), "Speaker")
tdcolnames(speaker.prop)[2] <- "Percent"
# Create column of token counts
<- as.data.frame(table(td$Speaker))
speaker.n colnames(speaker.n) <- c("Speaker", "Total N")
# Merge column of frequencies and column of token
# counts with column of Speakers
<- merge(td.ranef, speaker.prop, by = "Speaker")
td.ranef.speaker <- merge(td.ranef.speaker, speaker.n,
td.ranef.speaker by = "Speaker")
# Order data from lowest to highest Intercept,
# reset/delete row names
<- td.ranef.speaker[order(td.ranef.speaker$Intercept,
td.ranef.speaker $Percent), ]
td.ranef.speakerrownames(td.ranef.speaker) <- NULL
# Show final table, supress rownames
print(td.ranef.speaker, row.names = FALSE)
```

```
Speaker Intercept Percent Total N
LATF53 -1.14944 0.0625 16
GARF37 -1.00238 0.1429 28
LELM91 -0.82195 0.0000 12
SILM90 -0.76980 0.2222 18
GARM42 -0.68499 0.2308 13
LEOF66 -0.67818 0.2083 24
BOUF65 -0.67444 0.1765 17
FRAM93 -0.67112 0.1053 19
JOYF91 -0.64378 0.0556 18
SMIF58 -0.63704 0.2941 17
DONM58 -0.63125 0.3529 17
CARM91 -0.59391 0.1176 17
ELLF61 -0.58827 0.1200 25
GARM85 -0.58814 0.3333 9
STEF99 -0.56206 0.1875 16
ARSM91 -0.50258 0.1905 21
JOCF91 -0.47713 0.1176 17
DELF91 -0.47659 0.1111 18
HUNF22 -0.46537 0.0625 32
GREM45 -0.37970 0.3889 18
STEM65 -0.35627 0.0000 2
BEAM91 -0.34013 0.1250 16
HANF83 -0.33334 0.0000 4
INGM87 -0.26438 0.3000 20
EVAF92 -0.22506 0.2105 19
GARF16 -0.19906 0.3125 16
DONM53 -0.18729 0.3125 16
GARF87 -0.18074 0.1731 52
GAVM90 -0.15733 0.3684 19
ELLF29 -0.15042 0.3571 14
CHIF55 -0.11282 0.3000 20
MOFM55 -0.10666 0.2857 14
GOUM91 -0.08336 0.2222 18
PHAM91 -0.05544 0.2222 27
MORF91 0.00951 0.1875 16
DONM41 0.04716 0.4000 5
KAYF29 0.06456 0.4667 15
STEM42 0.08117 0.4667 15
HOLM52 0.08846 0.5000 16
PACM94 0.08947 0.2667 15
CLAM73 0.10943 0.5000 4
DONF15 0.13907 0.3929 28
PEIF57 0.16229 0.3529 17
CLAF52 0.20791 0.3529 17
NEIF49 0.21234 0.3529 17
CONM89 0.25294 0.3333 9
RUDF73 0.25617 0.4118 17
ROBM64 0.27022 0.4375 16
DAVM90 0.34813 0.3333 9
ROLF91 0.44736 0.3214 28
KAYM29 0.52023 0.6250 16
VIKF91 0.56214 0.3333 18
DOUF46 0.56661 0.4706 17
SMIM61 0.58311 0.6250 16
STAM21 0.69893 1.0000 2
TAMF91 0.69922 0.3571 14
GAVF93 0.72170 0.3889 18
HOLF49 0.77544 0.4444 18
GREF22 0.78227 0.5294 17
SAMF61 0.82955 0.5625 16
HANM57 0.86675 1.0000 3
HAWM90 1.12063 0.7222 18
INGM84 1.12780 0.5088 57
NATF84 1.17572 0.6875 16
MARM92 1.42939 0.5660 53
VICF91 1.54293 0.5882 17
```

If you look at the top (`head()`

) and bottom (`tail()`

) of this new table you can see that speakers `LATF53`

and `LELM91`

are the most likely to produce fully-realized (t, d) (even though, in the case of `LATF53`

, the frequency of `Deletion`

is not the lowest), while `VICF91`

and `MARM92`

are the most likely to delete (t, d). This is because the former have an overall higher baseline likelihood (`(Intercept)`

+ random effect estimate) and the latter have an overall lower baseline likelihood (`(Intercept)`

+ random effect estimate). This information could be very useful to your analysis.

```
# Show first six rows of td.ranef.speaker
head(td.ranef.speaker)
```

```
Speaker Intercept Percent Total N
1 LATF53 -1.149 0.0625 16
2 GARF37 -1.002 0.1429 28
3 LELM91 -0.822 0.0000 12
4 SILM90 -0.770 0.2222 18
5 GARM42 -0.685 0.2308 13
6 LEOF66 -0.678 0.2083 24
```

```
# Show last six rows of td.ranef.speaker
tail(td.ranef.speaker)
```

```
Speaker Intercept Percent Total N
61 HANM57 0.867 1.000 3
62 HAWM90 1.121 0.722 18
63 INGM84 1.128 0.509 57
64 NATF84 1.176 0.688 16
65 MARM92 1.429 0.566 53
66 VICF91 1.543 0.588 17
```

### Fixed Effects

Looking back again at `summary(td.glmer)`

, at the end of the details of the random effects you are presented with some useful information: `Number of obs: 1189, groups: Speaker, 66`

. This tells you the total number of tokens in your data set: `1189`

, and the total number of speakers: `66`

.

`summary(td.glmer)`

```
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: Dep.Var ~ Before + After.New + Morph.Type + Stress + Phoneme +
Center.Age + Sex + Education + (1 | Speaker)
Data: td
Control: glmerControl(optCtrl = list(maxfun = 20000), optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
1111 1192 -540 1079 1173
Scaled residuals:
Min 1Q Median 3Q Max
-5.082 -0.494 -0.255 0.488 15.059
Random effects:
Groups Name Variance Std.Dev.
Speaker (Intercept) 0.646 0.804
Number of obs: 1189, groups: Speaker, 66
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.25579 0.20213 -1.27 0.20571
Before1 -0.56365 0.20261 -2.78 0.00540 **
Before2 0.54285 0.19374 2.80 0.00508 **
Before3 0.10210 0.27866 0.37 0.71407
Before4 0.72073 0.19015 3.79 0.00015 ***
After.New1 1.83917 0.15736 11.69 < 2e-16 ***
After.New2 -1.16820 0.14440 -8.09 6e-16 ***
Morph.Type1 0.42343 0.14017 3.02 0.00252 **
Morph.Type2 -1.88251 0.21360 -8.81 < 2e-16 ***
Stress1 -0.79289 0.13744 -5.77 8e-09 ***
Phoneme1 0.28047 0.12770 2.20 0.02807 *
Center.Age 0.00579 0.00844 0.69 0.49296
Sex1 -0.12256 0.15040 -0.81 0.41511
Education1 -0.17890 0.18183 -0.98 0.32517
Education2 0.64732 0.27528 2.35 0.01870 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```
Correlation matrix not shown by default, as p = 15 > 12.
Use print(x, correlation=TRUE) or
vcov(x) if you need it
```

Next you have the analysis of fixed effects. In the leftmost column you have a list of the levels of each parameter minus one. More on that in a moment. For each level there is an estimate value, also called the coefficient. This value, expressed in log odds, is like a factor weight. Unlike factor weights which are centred around \(0.5\) and range from \(0\) to \(1\), log odds are centred around \(0\) and range from \(+\infty\) to \(-\infty\). Parameter levels with positive polarity log odds favour the application value relative to that parameter’s baseline likelihood. Parameter levels with negative polarity log odds disfavour the application value relative to that parameter’s baseline likelihood.

The coefficient for the `(Intercept)`

, as described above, is the overall baseline likelihood. It is the likelihood, all things being equal, that any given token will have the application value rather than the non-application value. It is the mean of the baseline likelihoods of all the parameters in the model. It is just like the input value reported in *Goldvarb*. You can also refer to it as the centred mean. This value is usually reported in your manuscript as a probability. You can use the `inv.logit()`

function to convert it to a probability (see above).

After the `(Intercept)`

is the name of each predictor followed by a number (e.g., `Before1`

). Each number represents a different level of that predictor, but one level is missing. This is an annoying consequence of the `lme4`

package being built for the conventions of other disciplines where sum contrasts are less commonly used. The numbers correspond to the order of factors within the level. You can double-check this order using the function `levels()`

```
# Display the levels of a column Before
levels(td$Before)
```

```
[1] "Liquid" "Nasal" "Other Fricative" "S"
[5] "Stop"
```

The levels will always be in alphabetical order unless you explicitly change them. In your results, `Before1`

is `Liquid`

, `Before2`

is `Nasal`

, `Before3`

is `Other Fricative`

, and `Before4`

is `S`

. The “missing” level is the last level, `Stop`

. Because the log odds for all levels of a parameter are centred around the mean, you can actually calculate the estimate/coefficient for this last level. The sum off all coefficients for a single parameter will equal zero. Therefore the coefficient of the missing level will be 0 minus the sum of all the remaining coefficients for that parameter. So the estimate for `Stop`

is:

\[0= [\text{Before1}x + \text{Before2}x + \text{Before3}x + \text{Before4}x] + \text{Missing Coefficient}\\\]\[Thus...\]\[0- [\text{Before1}x + \text{Before2}x + \text{Before3}x + \text{Before4}x] =\text{Missing Coefficient}\]

We can extract the specific values using the `fixef()`

function and the position of the coefficients in the list.

```
# Get the coefficients for the fixed effects
fixef(td.glmer)
```

```
(Intercept) Before1 Before2 Before3 Before4 After.New1
-0.25579 -0.56365 0.54285 0.10210 0.72073 1.83917
After.New2 Morph.Type1 Morph.Type2 Stress1 Phoneme1 Center.Age
-1.16820 0.42343 -1.88251 -0.79289 0.28047 0.00579
Sex1 Education1 Education2
-0.12256 -0.17890 0.64732
```

```
# Subtract the sum of the coefficients from 0 by
# name
0 - sum(fixef(td.glmer)[c("Before1", "Before2", "Before3",
"Before4")])
```

`[1] -0.802`

```
# Subtract the sum of the coefficients from 0
# more easily by position
0 - sum(fixef(td.glmer)[2:5])
```

`[1] -0.802`

Using the `inv.logit()`

function, you can also calculate the probabilities (e.g., centered factor weights) for each of these parameter levels. We can adjust the number of significant digits so that *R* does your rounding automatically.

```
# Set number of significant digits to 2
options(digits = 2)
# Probability of Liquid
inv.logit(fixef(td.glmer)["Before1"])
```

```
Before1
0.36
```

```
# Probability of Nasal
inv.logit(fixef(td.glmer)["Before2"])
```

```
Before2
0.63
```

```
# Probability of Other Fricative
inv.logit(fixef(td.glmer)["Before3"])
```

```
Before3
0.53
```

```
# Probability of S
inv.logit(fixef(td.glmer)["Before4"])
```

```
Before4
0.67
```

```
# Probability of Stop
inv.logit(0 - sum(fixef(td.glmer)[2:5]))
```

`[1] 0.31`

Based on this calculation you now know that the constraint hierarchy based on factor weight-like probabilities for preceding segment is `S`

(\(0.67\)) > `Nasal`

(\(0.63\)) >`Other Fricative`

(\(0.53\)) > `Liquid`

(\(0.36\)) > `Stop`

(\(0.31\)). An easier way to get these values is with the combination of `plogis()`

, which converts log odds to probabilities like `inv.logit()`

, and `fct_rev()`

, which reverses the order of factors. Re-creating `td.glmer`

with all parameter levels being reversed means the final/“missing” levels in `td.glmer`

are now the first levels. So, for `td.glmer.reversed`

we only look at `fct_rev(Before)1`

, `fct_rev(Morph.Type)1`

, etc. This is a quick way to get the values for the missing levels.

```
# Get probabilities for all estimates in td.glmer
plogis(fixef(td.glmer))
```

```
(Intercept) Before1 Before2 Before3 Before4 After.New1
0.44 0.36 0.63 0.53 0.67 0.86
After.New2 Morph.Type1 Morph.Type2 Stress1 Phoneme1 Center.Age
0.24 0.60 0.13 0.31 0.57 0.50
Sex1 Education1 Education2
0.47 0.46 0.66
```

```
# Re-create td.glmer with all parameters with
# reversed factor orders
<- glmer(Dep.Var ~ fct_rev(Before) +
td.glmer.reversed fct_rev(After.New) + fct_rev(Morph.Type) + fct_rev(Stress) +
fct_rev(Phoneme) + Center.Age + fct_rev(Sex) +
+ (1 | Speaker), data = td, family = "binomial",
Education control = glmerControl(optCtrl = list(maxfun = 20000),
optimizer = "bobyqa"))
# Get probabilities for all estimates in
# td.glmer.reversed. Just looking at the first
# value (which corresponds to the final/missing
# value in td.glmer)
plogis(fixef(td.glmer.reversed))
```

```
(Intercept) fct_rev(Before)1 fct_rev(Before)2
0.44 0.31 0.67
fct_rev(Before)3 fct_rev(Before)4 fct_rev(After.New)1
0.53 0.63 0.34
fct_rev(After.New)2 fct_rev(Morph.Type)1 fct_rev(Morph.Type)2
0.24 0.81 0.13
fct_rev(Stress)1 fct_rev(Phoneme)1 Center.Age
0.69 0.43 0.50
fct_rev(Sex)1 Education1 Education2
0.53 0.46 0.66
```

These values are not the overall probability for each level, but rather centred probability/factor weights. An estimate of \(0\) log odds (\(0.50\) probability) indicates the likelihood/probability for tokens of that predictor level is equal to the overall likelihood `(Intercept)`

. To get the actual probability for a given level, you have to add its estimate to the `(Intercept)`

. The overall likelihood for `Female`

(e.g.,`Sex1`

) tokens is thus \(-0.38\) log odds or \(41\%\).

```
# Add the estimate for Sex1 to the estimate for
# (Intercept)
sum(fixef(td.glmer)["Sex1"], fixef(td.glmer)["(Intercept)"])
```

`[1] -0.38`

```
# Convert the sum of the estimates for Sex1 and
# (Intercept) to probability
inv.logit(sum(fixef(td.glmer)["Sex1"], fixef(td.glmer)["(Intercept)"]))
```

`[1] 0.41`

Returning now to the `summary(td.glmer)`

, in the second and third columns of the fixed effects, the standard error and \(z\) value are reported. Both are used to calculate the estimate. Whether the difference in likelihood represented by the estimate/coefficient for each level is significantly different from zero (i.e., equal to the overall likelihood `(Intercept)`

) is also calculated using the standard error and is reported in the fourth column. The `Pr(>|z|)`

value is the probability that this difference is equal to zero. The asterisks indicate whether this probability is lower than increasingly smaller thresholds. Generally, in the humanities and social sciences we use \(p>0.05\) as our significance threshold,^{3} so anything with at least one asterisk is considered significant. For the levels of `Before`

, the coefficients for `Liquid`

(`Before1`

), `Nasal`

(`Before1`

), and `S`

(`Before4`

) are significantly different from zero. In other words, the likelihood of `Deletion`

for these tokens is significantly different from the baseline. This is not the case for `Other Fricative`

(`Before3`

) tokens. For the “missing” level, `Stop`

, you know that the coefficient/estimate is `-0.8020349596`

which is a greater negative number than the estimate for `Before1`

, so you can infer that this difference must also be significant. To verify you can reorder the levels of `Before`

such that `Stop`

is no longer the last factor. You can do this by creating a new column with reordered factors, or you can use the `fct_rev()`

function to do the same inside the `glmer()`

formula.

```
# Re-order Before in reverse alphabetical order
$Before.Reorder <- factor(td$Before, levels = c("Stop",
td"S", "Other Fricative", "Nasal", "Liquid"))
# Re-create td.glmer with reordered Before
<- glmer(Dep.Var ~ Before.Reorder +
td.glmer.reorder + Morph.Type + Stress + Phoneme + Center.Age +
After.New + Education + (1 | Speaker), data = td, family = "binomial",
Sex control = glmerControl(optCtrl = list(maxfun = 20000),
optimizer = "bobyqa"))
# Alternative method
<- glmer(Dep.Var ~ fct_rev(Before) +
td.glmer.reorder + Morph.Type + Stress + Phoneme + Center.Age +
After.New + Education + (1 | Speaker), data = td, family = "binomial",
Sex control = glmerControl(optCtrl = list(maxfun = 20000),
optimizer = "bobyqa"))
summary(td.glmer.reorder)
```

```
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: Dep.Var ~ fct_rev(Before) + After.New + Morph.Type + Stress +
Phoneme + Center.Age + Sex + Education + (1 | Speaker)
Data: td
Control: glmerControl(optCtrl = list(maxfun = 20000), optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
1111 1192 -540 1079 1173
Scaled residuals:
Min 1Q Median 3Q Max
-5.082 -0.494 -0.255 0.488 15.060
Random effects:
Groups Name Variance Std.Dev.
Speaker (Intercept) 0.646 0.804
Number of obs: 1189, groups: Speaker, 66
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.25579 0.20213 -1.27 0.20571
fct_rev(Before)1 -0.80205 0.18917 -4.24 2.2e-05 ***
fct_rev(Before)2 0.72072 0.19015 3.79 0.00015 ***
fct_rev(Before)3 0.10216 0.27866 0.37 0.71391
fct_rev(Before)4 0.54284 0.19374 2.80 0.00508 **
After.New1 1.83918 0.15736 11.69 < 2e-16 ***
After.New2 -1.16821 0.14440 -8.09 6.0e-16 ***
Morph.Type1 0.42345 0.14017 3.02 0.00252 **
Morph.Type2 -1.88255 0.21360 -8.81 < 2e-16 ***
Stress1 -0.79289 0.13744 -5.77 8.0e-09 ***
Phoneme1 0.28047 0.12770 2.20 0.02807 *
Center.Age 0.00579 0.00844 0.69 0.49294
Sex1 -0.12256 0.15040 -0.81 0.41511
Education1 -0.17891 0.18183 -0.98 0.32514
Education2 0.64733 0.27528 2.35 0.01870 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```
Correlation matrix not shown by default, as p = 15 > 12.
Use print(x, correlation=TRUE) or
vcov(x) if you need it
```

You can see above that the coefficient/estimate for `Before.Reorder`

, which you know is `Stop`

, is `-0.80205`

— nearly identical to what you calculated (the difference is due to rounding). You can see based on the value for `Pr(.|z|)`

that \(p=2.2\times 10^{-5}\), which is definitely lower than \(0.05\), i.e., significant.

For sum contrast coding, the `Pr(>|z|)`

value for the `(Intercept)`

tells you whether the baseline likelihood is significantly different from \(0\) — but remember, \(0\) log odds is equivalent to a probability of \(50\%\) or a \(50/50\) chance of a token being `Deletion`

. For the intercept here, the value is \(-0.277\) log odds (or \(44\%\) probability), which the model can’t verify as being statistically significantly different from \(0\) log odds (\(50\%\) probability).

Following the fixed effects there is usually a matrix of correlations. With many predictors or with predictors with many levels this correlation matrix can be very large. If the matrix is too large *R* will not print it automatically. Don’t worry too much about the correlation matrix right now. We will return to it in Part 3.

A useful way to visualize the fixed effects is with the function `plot_model()`

from the `sjPlot`

and affiliated packages. You should have `ggplot2`

already installed if you’ve been following along.

```
# Install sjPlot and affiliated pacakges
install.packages(c("sjPlot", "sjlabelled", "sjmisc"))
```

```
# Load required packages
library(sjPlot)
library(sjlabelled)
library(sjmisc)
library(ggplot2)
# Plot fixed effects
plot_model(td.glmer)
```

In the plot the default *x*-axis is transformed to odds ratios. You’ll remember that odds ratios are mathematically equivalent to both log odds and probabilities. To show either of these in plot, you can use the `transform=`

option, `NULL`

(no transformation) for log odds and `"plogis"`

for probabilities.

```
# Plot fixed effects with log odds as the x-axis
plot_model(td.glmer, transform = NULL)
```

```
# Plot fixed effects with probabilities as the
# x-axis
plot_model(td.glmer, transform = "plogis")
```

You’ll see that for the log odds plot the values are centered around \(0\) (no effect), which is equivalent to 1 odds ratio in the odds ratio plot, or \(0.50\) probability in the probability plot. The dots represent the estimate of the fixed effects. The lines extending to the right and left of the dots represent the bounds of the standard error. If the standard error does not cross the center line then the effect is statistically significant. The red dots in the log odds and odds ratio plots indicate values below the center line, red values indicate values below the center line. In the probability plot the values are all unfortunately red. As with the output of the sum contrast `glmer()`

model, there is also unfortunately one “missing” value for each predictor.

You can show the estimate values using the option `show.values = TRUE`

. Doing so also adds the significance asterisks (which can be suppressed, if desired, with `show.p = FALSE`

). The values will be plotted directly on top of the points, so use `value.offset`

to adjust the relative positioning. You can also highlight the center line with the `vline.color`

option.

```
# Plot fixed effects with log odds as the x-axis,
# estimates and significance showing, and
# highlighted center line
plot_model(td.glmer, transform = NULL, show.values = TRUE,
value.offset = 0.3, vline.color = "black")
```

You can see that all error bars that cross the center line are not significant. You can sort the individual levels of the predictors from most favouring to least favouring using the option `sort.est = TRUE`

. You can change the title using `title =`

. You can also make this graph readable in non-colored manuscripts using `color="bw"`

or `color = "gs"`

and employ some of the themes you encountered in previous chapters. Other tweaks to the plot can be found here

```
# Plot fixed effects with log odds as the x-axis,
# estimates and significance showing, highlighted
# center line, and sorted estimates
plot_model(td.glmer, transform = NULL, show.values = TRUE,
value.offset = 0.3, vline.color = "black", sort.est = TRUE,
title = "Likelihood of (t,d) deletion", colors = "gs") +
theme_classic()
```

While this type of plot may be less useful when reporting on a sum contrast regression analysis (as there are missing values), it is very useful when reporting on treatment contrast regression analyses. You can also plot the random effects per `Speaker`

by using the option `type = "rf"`

. This provides similar information as you extracted from the `glmer()`

model in Section 1.2.1. To sort this plot by random effect estimate you also need to add `grid = FALSE`

.

```
# Plot random effects with log odds as the
# x-axis, estimates and significance showing,
# highlighted center line, and sorted estimates
plot_model(td.glmer, type = "re", transform = NULL,
vline.color = "black", sort.est = "sort.all", grid = FALSE,
title = "Random effect per Speaker") + theme_classic()
```

```
Warning in checkMatrixPackageVersion(): Package version inconsistency detected.
TMB was built with Matrix version 1.4.1
Current Matrix version is 1.5.3
Please re-install 'TMB' from source using install.packages('TMB', type = 'source') or ask CRAN for a binary version of 'TMB' matching CRAN's 'Matrix' package
```

The above plot shows that, based on the standard error, only five speakers have baseline likelihoods of `Deletion`

significantly different from the overall intercept: `GARF37`

and `LATF53`

have a significantly lower baseline likelihood of `Deletion`

, while `VICF91`

, `MARM92`

, `NATF84`

, `INGM84`

and `HAWM90`

have a significantly higher baseline likelihood of deletion.

## Determining Significance and Magnitude of Effect

Let’s return now to the **three lines of evidence**. Does the model tell you which factors groups are significant predictors of the dependent variable? The answer: sort of. It tells you which levels of certain predictors are significantly different from the baseline, but this isn’t the same thing as signalling which predictors, collectively, create the best (e.g., most explanatory) model of the variation — the way *Goldvarb*’s step-up/step-down model does. In other words, you aren’t provided with the first two lines of evidence. You can figure out the third line of evidence, constraint hierarchy, but this would be the constraint hierarchy in what could conceivably be an overstuffed model. What you need is a tool to determine which factors **should** be in the model — or, rather, which factors actually explain the variation and which factors are erroneous (see **Which Model is best?** above). For this you can use the Wald \(\chi^2\) (*chi* [kaj] *square*) test. The Wald \(\chi^2\) test iteratively adds and removes each factor group/predictor, known as a parameter of the model, and compares how well each iteration fits the distribution of the data. If a parameter is found to be significant, it is interpreted as adding explanatory value. If a parameter is not significant, its contribution is superfluous to the understanding of the data and can be set aside. In this way, the Wald \(\chi^2\) test is very similar to the step-up/step-up down procedure implemented by *Goldvarb*. The result of the Wald \(\chi^2\) test reveals what combination of original parameters make the most parsimonious (‘sparse’) model, or rather, a group of original factors that only includes those that contribute significantly to predicting the variation.

The Wald \(\chi^2\) test is part of the `car`

package. The function, `Anova()`

is performed on an object, in this case `td.glmer`

, which is the result of a previously-performed logistic regression. Be careful, though! There is another function `anova()`

, which does not perform the Wald \(\chi^2\) test and is instead used for comparing different models.

```
# Wald Chi-Square test of most parsimonious model
library(car)
Anova(td.glmer)
```

```
Analysis of Deviance Table (Type II Wald chisquare tests)
Response: Dep.Var
Chisq Df Pr(>Chisq)
Before 38.67 4 8.1e-08 ***
After.New 147.85 2 < 2e-16 ***
Morph.Type 77.77 2 < 2e-16 ***
Stress 33.28 1 8.0e-09 ***
Phoneme 4.82 1 0.028 *
Center.Age 0.47 1 0.493
Sex 0.66 1 0.415
Education 5.60 2 0.061 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

The results of the Wald \(\chi^2\) test gives you the first line of evidence. They show you which factor groups, or parameters, add explanatory value to the model and which don’t. This is functionally equivalent to the selection of significant factors in a step-up/step-down procedure. The results also tell you the relative magnitude of effect of each parameter. The larger the \(\chi^2\) statistic (`Chisq`

), the greater magnitude of effect. Using \(p>0.05\) as the cut-off you see that `Before`

, `After.New`

, `Morph.Type`

, `Stress`

, and `Phoneme`

all add explanatory value. `Centre.Age`

, `Sex`

, and `Education`

do not (unsurprising given the results of the Random Forest analysis). This means that the finding that there is a division between men and women, and among men between those born before and after 1990 (as suggested by the Conditional Inference Tree analysis), is in fact not real once you take the linguistic factors and the random effect of speaker into account. Put another way, you do not have statistical validation for the observed trend in the summary statistics. In the Wald \(\chi^2\) results, `After.New`

has the largest \(\chi^2\) value (\(147.85\)) indicating it has the largest magnitude of effect on the variation. This is functionally equivalent to saying that its factor weights have the largest range. In descending order you then have `Morph.Type`

(\(\chi^2 = 77.77\)), `Before`

(\(\chi^2 = 38.67\)), `Stress`

(\(\chi^2 = 33.28\)), and `Phoneme`

(\(\chi^2 = 4.82\)).

Here is how you might represent these results in a manuscript:

The last line of evidence is the constraint hierarchy, or rather, the order of constraints from most favouring to least favouring. This last line of evidence in *Goldvarb* requires factor weights. Specifically, it requires the factor weights from the best step-up model and best step-down model — which should match. To re-create the equivalent model you simply create the most parsimonious model identified by the Wald \(\chi^2\) test. Here, that is a model constructed with only `After.New`

, `Morph.Type`

, `Before`

, `Stress`

, and `Phoneme`

.

```
# Most Parsimonious Model: Generalized linear
# mixed effects model with the fixed main effects
# of Before, After.New, Morph.Type, Stress,
# Phoneme, and the random effect of Speaker
library(lme4)
<- glmer(Dep.Var ~ After.New +
td.glmer.parsimonious + Before + Stress + Phoneme + (1 | Speaker),
Morph.Type data = td, family = "binomial", control = glmerControl(optCtrl = list(maxfun = 20000),
optimizer = "bobyqa"))
summary(td.glmer.parsimonious)
```

```
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: Dep.Var ~ After.New + Morph.Type + Before + Stress + Phoneme +
(1 | Speaker)
Data: td
Control: glmerControl(optCtrl = list(maxfun = 20000), optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
1114 1175 -545 1090 1177
Scaled residuals:
Min 1Q Median 3Q Max
-5.223 -0.488 -0.259 0.495 14.033
Random effects:
Groups Name Variance Std.Dev.
Speaker (Intercept) 0.796 0.892
Number of obs: 1189, groups: Speaker, 66
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.277 0.207 -1.34 0.18034
After.New1 1.840 0.157 11.71 < 2e-16 ***
After.New2 -1.175 0.144 -8.14 4.1e-16 ***
Morph.Type1 0.426 0.140 3.05 0.00230 **
Morph.Type2 -1.892 0.213 -8.87 < 2e-16 ***
Before1 -0.575 0.202 -2.84 0.00447 **
Before2 0.526 0.193 2.72 0.00659 **
Before3 0.117 0.278 0.42 0.67370
Before4 0.731 0.190 3.85 0.00012 ***
Stress1 -0.799 0.137 -5.81 6.2e-09 ***
Phoneme1 0.287 0.128 2.25 0.02462 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) Aft.N1 Aft.N2 Mrp.T1 Mrp.T2 Befor1 Befor2 Befor3 Befor4
After.New1 0.064
After.New2 -0.104 -0.430
Morph.Type1 -0.434 0.203 -0.114
Morph.Type2 -0.051 -0.221 0.178 -0.376
Before1 -0.296 -0.223 0.293 0.052 0.429
Before2 -0.164 0.191 -0.094 -0.110 0.247 0.029
Before3 0.150 0.018 -0.060 0.319 -0.515 -0.421 -0.477
Before4 0.250 0.304 -0.431 -0.202 0.051 -0.311 -0.090 -0.274
Stress1 -0.434 -0.432 -0.064 0.050 0.097 0.056 0.125 -0.094 -0.250
Phoneme1 0.459 0.149 -0.307 -0.137 -0.265 -0.543 -0.263 0.149 0.438
Strss1
After.New1
After.New2
Morph.Type1
Morph.Type2
Before1
Before2
Before3
Before4
Stress1
Phoneme1 -0.107
```

## Creating a Manuscript-ready Table

The estimates or coefficients give us the last line of evidence — and the last piece of statistical information that is generally reported in a standard *Goldvarb*-style manuscript table. Table 2 is such a table constructed using the information from the `td.glmer.parsimonious`

regression analysis.

The *Input* is the estimate of the intercept, converted to a probability using the `inv.logit()`

function. You can quickly get these values using `plogis(fixef(td.glmer.parsimonious))`

. The *Total N*, frequencies and *n* counts for each factor come from the summary statistics you performed earlier. The factor weights for each factor are that factor’s estimates converted to probabilities, again using the `inv.logit()`

or `plogis()`

function. Any mixed-effects model with a random effect should report the random effect. *Speaker* is listed as a random effect, and the dispersion among speakers is reported. As noted above, there is no consensus around whether to report the `Variance`

or `Std.Dev`

as the measure of this dispersion (remember standard deviation is simply the square root of the variance). Here I’ve reported standard deviation.

`plogis(fixef(td.glmer.parsimonious))`

```
(Intercept) After.New1 After.New2 Morph.Type1 Morph.Type2 Before1
0.43 0.86 0.24 0.61 0.13 0.36
Before2 Before3 Before4 Stress1 Phoneme1
0.63 0.53 0.68 0.31 0.57
```

The range for each factor group is the difference between the largest factor weight and the lowest factor weight expressed as a whole number. Notice that the ordering of magnitude of effect by the range of probabilities is slightly different from the ordering of magnitude of effect based on the \(\chi^2\) coefficient from the Wald \(\chi^2\) test and the ordering from the Random Forest analysis. For this reason it may be prudent to be very careful when using magnitude of effect/the second line of evidence to compare similarity/difference across data sets. Using multiple means to assess magnitude of effect is warranted, as is being very transparent about the means you use.

Many who create *Goldvarb*-style tables using data from either *Rbrul* or *R*’s `glmer()`

function report both the log odds and factor weights for a given factor (e.g., Drummond 2012, Tables 3-8; Becker 2014, Tables 5-6, etc.). I have not done so in Table 2 because reporting both is redundant: probability (factor weights), odds-ratios, and likelihood (log odds) are functionally the same, and one can be derived from the other mathematically. Finally, if you wanted to report the factor weights, proportions, and token counts for non-significant factors you could do so (of course, following conventions of the field by enclosing the factor weights in square brackets and not reporting the range) with values taken from the full (not most-parsimonious) model and the summary statistics. The full model is equivalent to the first model in a step-up/step-down analysis, or one-way analysis in *Goldvarb*.

While it may seem retrogressive to report the results of an `lme4`

analysis in the style of *Goldvarb*, presenting results in this fashion is highly readable and easily interpreted by other sociolinguistic researchers. Further, it is a succinct format for doing cross-model/data set comparisons. It also fulfills the requisites described by Gregory Guy in his *LVC guidelines for reporting quantitative results* [@-Guy2018]. For example, Table 2 compares the (t, d) deletion among young speakers with (t, d) deletion among middle/old speakers (see Modifying Data). It very easily shows how the three lines of evidence are both similar and different between the two age cohorts. Representing this comparison using raw `lme4`

/`glmer()`

outputs (or tables resembling this output) would be harder to read and thus less immediately interpretable.

From Table 3 you can observe several patterns. Firstly, the overall probability of `Deletion`

among young speakers is \(.41\) and among middle/old speakers is \(.34\). This indicates that `Deletion`

is more likely to occur among young speakers (though given that two measures of age are not significant when the data is combined suggests that this difference cannot be verified to be greater than chance variation). With respect to the first line of evidence (significance), you can see that for both age cohorts the same linguistic factors are significant predictors of the variation, indicating similar grammatical systems. Also important is that the same predictor, `Phoneme`

, is not significant, also indicating similar grammatical systems. `Gender`

is significant among middle/older speakers but not among younger speakers. This aligns with the findings from the Conditional Inference Tree, which shows that older men delete at a greater rate than everyone else. You can see some difference between cohorts when you consider magnitude of effect. For both cohorts following context (`After.New`

) has a greater magnitude of effect than stress or preceding context (`Before`

). Morpheme type (`Morph.Type`

), however, has a greater magnitude of effect among middle/older speakers relative to other predictors, while among younger speakers morpheme type has a lesser magnitude of effect compared to following context. This would be a pertinent finding to discuss in your manuscript. For the third line of evidence, constraint hierarchy, both cohorts have the same ranking of predictor levels for morpheme type, following context, stress, and, for the most part, preceding context. The one difference is preceding /s/, which highly favours deletion among younger speakers, but slightly disfavours deletion among older speakers. The one disadvantage of this *Goldvarb*-style table is that it does not show the individual, per-level significance measures. Looking at `td.glmer.not.young`

below shows that the probability of `Deletion`

among middle/older speakers’ preceding /s/ tokens is not statistically different from the mean. In other words, predicting /s/ is a strong favouring predictor of `Deletion`

among young speakers, but an inconsequential predictor among middle/older speakers. When examining the `glmer()`

outputs below, preceding /s/ is `Before4`

. What the output of `td.glmer.not.young`

also shows is that preceding other fricatives are also not significantly different from the mean. This suggests that for middle/older speakers /s/ and other fricatives behaving similarly, while for younger speakers /s/ and other fricatives do not behave similarly. We will delve into this phenomenon in Part 4.

```
# Subset data
<- td %>%
td.young subset(Age.Group == "Young")
<- td %>%
td.not.young subset(Age.Group != "Young")
# Create young speaker regression model
<- glmer(Dep.Var ~ After.New + Morph.Type +
td.glmer.young + Stress + Phoneme + (1 | Speaker), data = td.young,
Before family = "binomial", glmerControl(optCtrl = list(maxfun = 20000),
optimizer = "bobyqa"))
summary(td.glmer.young)
```

```
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: Dep.Var ~ After.New + Morph.Type + Before + Stress + Phoneme +
(1 | Speaker)
Data: td.young
Control: glmerControl(optCtrl = list(maxfun = 20000), optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
609 662 -292 585 616
Scaled residuals:
Min 1Q Median 3Q Max
-4.093 -0.488 -0.287 0.487 6.006
Random effects:
Groups Name Variance Std.Dev.
Speaker (Intercept) 0.944 0.971
Number of obs: 628, groups: Speaker, 31
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.2631 0.2891 -0.91 0.363
After.New1 1.6858 0.2189 7.70 1.3e-14 ***
After.New2 -1.2779 0.1926 -6.64 3.2e-11 ***
Morph.Type1 0.2434 0.1780 1.37 0.172
Morph.Type2 -1.4470 0.2587 -5.59 2.2e-08 ***
Before1 -0.4411 0.2571 -1.72 0.086 .
Before2 0.6435 0.2687 2.40 0.017 *
Before3 -0.0946 0.3723 -0.25 0.799
Before4 1.1065 0.2445 4.53 6.0e-06 ***
Stress1 -0.9500 0.1788 -5.31 1.1e-07 ***
Phoneme1 0.1545 0.1728 0.89 0.371
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) Aft.N1 Aft.N2 Mrp.T1 Mrp.T2 Befor1 Befor2 Befor3 Befor4
After.New1 0.071
After.New2 -0.139 -0.556
Morph.Type1 -0.298 0.251 -0.171
Morph.Type2 -0.089 -0.205 0.199 -0.339
Before1 -0.246 -0.132 0.255 -0.025 0.320
Before2 -0.128 0.237 -0.159 -0.124 0.238 0.070
Before3 0.213 -0.053 -0.019 0.241 -0.414 -0.439 -0.488
Before4 0.126 0.359 -0.489 -0.031 0.049 -0.251 -0.018 -0.257
Stress1 -0.394 -0.378 0.092 0.004 0.088 0.045 -0.039 -0.085 -0.269
Phoneme1 0.367 0.111 -0.249 0.041 -0.267 -0.467 -0.371 0.218 0.351
Strss1
After.New1
After.New2
Morph.Type1
Morph.Type2
Before1
Before2
Before3
Before4
Stress1
Phoneme1 0.009
```

```
# Create middle/old speaker regression model
<- glmer(Dep.Var ~ After.New + Morph.Type +
td.glmer.not.young + Stress + Phoneme + (1 | Speaker), data = td.not.young,
Before family = "binomial", glmerControl(optCtrl = list(maxfun = 20000),
optimizer = "bobyqa"))
summary(td.glmer.not.young)
```

```
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: Dep.Var ~ After.New + Morph.Type + Before + Stress + Phoneme +
(1 | Speaker)
Data: td.not.young
Control: glmerControl(optCtrl = list(maxfun = 20000), optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
508 560 -242 484 549
Scaled residuals:
Min 1Q Median 3Q Max
-2.640 -0.467 -0.158 0.482 25.237
Random effects:
Groups Name Variance Std.Dev.
Speaker (Intercept) 0.745 0.863
Number of obs: 561, groups: Speaker, 35
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.5123 0.3483 -1.47 0.14136
After.New1 2.0614 0.2532 8.14 3.9e-16 ***
After.New2 -0.9225 0.2446 -3.77 0.00016 ***
Morph.Type1 0.8327 0.2706 3.08 0.00209 **
Morph.Type2 -2.6850 0.4202 -6.39 1.7e-10 ***
Before1 -0.8360 0.3555 -2.35 0.01869 *
Before2 0.5781 0.3120 1.85 0.06392 .
Before3 0.7617 0.4883 1.56 0.11875
Before4 0.0656 0.3427 0.19 0.84817
Stress1 -0.7591 0.2633 -2.88 0.00394 **
Phoneme1 0.2901 0.2159 1.34 0.17899
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) Aft.N1 Aft.N2 Mrp.T1 Mrp.T2 Befor1 Befor2 Befor3 Befor4
After.New1 0.136
After.New2 -0.019 -0.139
Morph.Type1 -0.624 0.091 -0.059
Morph.Type2 0.073 -0.225 0.146 -0.491
Before1 -0.356 -0.288 0.296 0.136 0.499
Before2 -0.151 0.156 -0.059 -0.116 0.247 -0.018
Before3 -0.075 0.061 -0.083 0.516 -0.669 -0.355 -0.449
Before4 0.470 0.204 -0.319 -0.478 0.111 -0.381 -0.144 -0.350
Stress1 -0.484 -0.557 -0.327 0.159 0.065 0.078 0.218 -0.034 -0.253
Phoneme1 0.589 0.206 -0.308 -0.375 -0.191 -0.622 -0.126 -0.018 0.567
Strss1
After.New1
After.New2
Morph.Type1
Morph.Type2
Before1
Before2
Before3
Before4
Stress1
Phoneme1 -0.200
```

Despite how useful a *Goldvarb*-style table is, it is not the only way to report the results you’ve produced. Nor is the estimate the only information you have. There are other interesting values in your `summary(td.glmer)`

output. Going left to right, after the estimate there is the standard error, the \(z\)-value and the \(p\)-value. According to Guy (2018), reporting the estimates, standard errors, and significance is desirable. Whether reporting the \(z\)-scores is required is unclear. Table 4 reports `td.glmer`

in a format more similar to the `lme4`

output. The likelihoods in Table 4 are presented in log odds. They correspond exactly to the probabilities in Table 2. One addition to the information in the `lme4`

output included in Table 4 is the Observation columns. It is very important to report these distributions by factor/parameter level, preferably in your table, or somewhere else in your manuscript.

You also may want to report in your table additional measures of model fit (\(R^2\)) and whether the model is an improvement over the null model.

### References

*Statistical methods for research workers*. Edinburgh: Oliver; Boyd.

*Social Sciences*. 4(2). 361–372.

## Footnotes

Equivalent to \(-2 \times\)

`logLik`

↩︎Equal to the sample size (e.g., the number of tokens, \(1189\)) minus the number of parameters being estimated in the model (levels of the fixed effect predictors plus the intercept).↩︎

As is generally the convention since Fisher (1925). Here the \(p\)-value represents the probability of obtaining the same observation (here, estimate for a parameter) if the null hypothesis (here, that the difference of the estimate for a parameter and the intercept was actually null) were true. \(p>0.05\) means that there is less than 5% probability that an value as extreme (or more extreme) would be observed. This corresponds to allowing as much as about two standard deviations of acceptable variation due to random chance before rejecting the null hypothesis. Said another way, this threshold means the null hypothesis will be false at least \(19\) times out of \(20\). See also Thron & Miller (2015).↩︎

## Reuse

## Citation

```
@online{gardner,
author = {Gardner, Matt Hunt},
title = {Mixed-Efects {Logistic} {Regression} {Analysis:} {Part} 2},
series = {Linguistics Methods Hub},
volume = {Doing LVC with R},
date = {},
url = {https://lingmethodshub.github.io/content/R/lvc_r/112_lvcr.html},
doi = {10.5281/zenodo.7160718},
langid = {en}
}
```

*Linguistics Methods Hub*:

*Doing LVC with R*. (https://lingmethodshub.github.io/content/R/lvc_r/112_lvcr.html). doi: 10.5281/zenodo.7160718