POLSCI 630: Probability and Basic Regression
February 18, 2025
We can model non-linear relationships using linear regression
But! Interpretation gets trickier in these cases.
Useful to remember:
A constant (linear) effect in log of \(x\) is non-linear in \(x\) itself
But a constant (linear) effect on \(\text{ln}(y_i)\) implies a non-constant (multiplicative) effect on \(y_i\) itself
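A quick base-R illustration of the first bullet (a sketch, not part of the original example): equal multiplicative steps in \(x\) are equal additive steps in \(\text{ln}(x)\).

```r
# Equal multiplicative steps in x become equal additive steps in log(x):
x <- c(1, 10, 100, 1000)  # each value is 10x the previous
log(x)                    # 0, 2.302585, 4.60517, 6.907755
diff(log(x))              # constant steps of log(10) ~ 2.302585
```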
As an example, let’s explore the salaries of professors in the US using carData::Salaries
rank discipline yrs.since.phd yrs.service sex
AsstProf : 67 A:181 Min. : 1.00 Min. : 0.00 Female: 39
AssocProf: 64 B:216 1st Qu.:12.00 1st Qu.: 7.00 Male :358
Prof :266 Median :21.00 Median :16.00
Mean :22.31 Mean :17.61
3rd Qu.:32.00 3rd Qu.:27.00
Max. :56.00 Max. :60.00
salary
Min. : 57800
1st Qu.: 91000
Median :107300
Mean :113706
3rd Qu.:134185
Max. :231545
It is reasonable to think that the effect of a variable (e.g., years of service) on salary is multiplicative: each additional year raises salary by a roughly constant percentage of its current level, not by a constant dollar amount
Let’s say we have the following model: \(\text{ln}(y_i) = \boldsymbol{x}_i' \boldsymbol{\beta} + u_i\)
\(y_i = e^{\boldsymbol{x}_i' \boldsymbol{\beta}} e^{u_i}, \quad \text{E}(y_i) = \text{E}(e^{\boldsymbol{x}_i' \boldsymbol{\beta}} e^{u_i})\)
\(= e^{\boldsymbol{x}_i' \boldsymbol{\beta}} \text{E}(e^{u_i})\)
If \(u_i \sim N(0, \sigma^2), \quad \text{E}(e^{u_i}) = e^{\frac{\sigma^2}{2}}\)
\(\text{E}(y_i) = e^{\boldsymbol{x}_i' \boldsymbol{\beta}} e^{\frac{\sigma^2}{2}}\)
Mean-zero error doesn’t drop out!
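We can verify \(\text{E}(e^{u_i}) = e^{\frac{\sigma^2}{2}}\) by simulation (an illustrative sketch; the seed and \(\sigma\) are arbitrary):

```r
# Simulate mean-zero normal errors and check E(exp(u)) = exp(sigma^2 / 2)
set.seed(630)
sigma <- 0.5
u <- rnorm(1e6, mean = 0, sd = sigma)
mean(exp(u))      # simulated value: ~1.133, not 1!
exp(sigma^2 / 2)  # theoretical value: 1.1331...
```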
m1 <- lm(log(salary) ~ yrs.since.phd + sex + discipline, data = carData::Salaries)

Call:
lm(formula = log(salary) ~ yrs.since.phd + sex + discipline,
data = carData::Salaries)
Residuals:
Min 1Q Median 3Q Max
-0.84527 -0.15697 -0.00855 0.15550 0.62670
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.125e+01 4.189e-02 268.671 < 2e-16 ***
yrs.since.phd 9.584e-03 9.075e-04 10.560 < 2e-16 ***
sexMale 6.639e-02 3.830e-02 1.734 0.0838 .
disciplineB 1.447e-01 2.319e-02 6.239 1.14e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.2244 on 393 degrees of freedom
Multiple R-squared: 0.2615, Adjusted R-squared: 0.2559
F-statistic: 46.39 on 3 and 393 DF, p-value: < 2.2e-16
Let’s calculate the expected value of \(y\) for a male professor, 10 years from PhD, from a “theoretical” department (discipline = A)
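One way to do this by hand, using the rounded coefficients printed above and assuming normal errors so that \(\text{E}(e^{u_i}) = e^{\frac{\sigma^2}{2}}\) (a sketch, not the only approach):

```r
# Rounded estimates from the m1 summary above
b0 <- 11.25; b_yrs <- 0.009584; b_male <- 0.06639; b_discB <- 0.1447
sigma <- 0.2244  # residual standard error

# Male, 10 years from PhD, discipline A (so disciplineB = 0)
xb <- b0 + b_yrs * 10 + b_male * 1 + b_discB * 0
Ey <- exp(xb) * exp(sigma^2 / 2)  # retransform with the normal correction
Ey  # roughly $92,700
```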
What if we do not assume normality in the original errors?
Let’s use a simple example:
\(\text{ln}(y_i) = \beta_0 + \beta_1x_{1i} + \beta_2x_{2i} + u_i\)
\(\text{E}(y_i) = e^{\beta_0}e^{\beta_1x_{1i}}e^{\beta_2x_{2i}}\text{E}(e^{u_i})\)
What is the expected change in \(y_i\) for a \(\Delta\) change in \(x_{1i}\)? That is, what happens to \(\text{E}(y_i)\) when \(x_{1i}\) becomes \(x_{1i} + \Delta x_{1i}\)?
Let’s take the ratio:
\[\frac{e^{\beta_0}e^{\beta_1(x_{1i} + \Delta x_{1i})}e^{\beta_2x_{2i}}\text{E}(e^{u_i})} {e^{\beta_0}e^{\beta_1x_{1i}}e^{\beta_2x_{2i}}\text{E}(e^{u_i})} = \frac{e^{\beta_1(x_{1i} + \Delta x_{1i})}}{e^{\beta_1x_{1i}}} = e^{\beta_1(\Delta x_{1i})}\]
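So a \(\Delta\) change in \(x_{1i}\) multiplies \(\text{E}(y_i)\) by \(e^{\beta_1 \Delta x_{1i}}\). Plugging in the yrs.since.phd slope printed earlier (a sketch using the rounded estimate):

```r
b1 <- 0.009584  # rounded slope on yrs.since.phd from m1
exp(b1 * 1)     # one extra year: factor of ~1.0096 (about +1%)
exp(b1 * 10)    # ten extra years: factor of ~1.10 (about +10%)
```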
Useful to remember:
\(\text{E}(y_i) = e^{\boldsymbol{x}_i' \boldsymbol{\beta}} \text{E}(e^{u_i})\)
Notice that the marginal effect of \(x_1\) is not constant anymore:
\(\frac{\partial \text{E}(y_i)}{\partial x_{1i}} = \beta_1 e^{\boldsymbol{x}_i' \boldsymbol{\beta}} \text{E}(e^{u_i}) = \beta_1\text{E}(y_i)\)
Since the marginal effect is conditional on the levels of the \(\boldsymbol{x}_i\), reporting it with a single number requires a choice. Two common strategies:
Fix all IVs to central tendencies
Calculate the average marginal effect across observations
# B1
B1 <- coef(m1)["yrs.since.phd"]
# exp(X_B)
eXB <- exp(predict(m1,
newdata = data.frame(yrs.since.phd = mean(carData::Salaries$yrs.since.phd),
sex = "Male",
discipline = "B"
)
)
)
# calculate estimate of E(exp(u)) (Duan's smearing estimator)
E_eu <- sum(exp(m1$residuals)) / nrow(m1$model)
# calculate ME at these central values
B1 * eXB * E_eu
## ME_i = B1 * exp(X_iB) * sum(exp(res_i)) / N
# B1
B1 <- coef(m1)["yrs.since.phd"]
# exp(X_B) [generate predicted values for all observations]
eXB <- exp(predict(m1))
# calculate estimate of E(exp(u)) (Duan's smearing estimator)
E_eu <- sum(exp(m1$residuals)) / nrow(m1$model)
# calculate ME for each observation
mes_yrs <- B1 * eXB * E_eu
# calculate mean of MEs (the average marginal effect)
mean(mes_yrs)
Let’s say you want to see how the marginal effect of years since PhD changes as a function of itself (e.g., from 1 year to 20)
# B1
B1 <- coef(m1)["yrs.since.phd"]
# expand.grid: all combinations of the focal and moderating values
values <- expand.grid(yrs.since.phd = 1:20,
                      sex = c("Female", "Male"),
                      discipline = c("A", "B"))
# exp(X_B) [predicted values across the grid]
eXB <- exp(predict(m1, newdata = values))
# calculate estimate of E(exp(u)) (Duan's smearing estimator)
E_eu <- sum(exp(m1$residuals)) / nrow(m1$model)
# calculate ME for each observation
mes_yrs <- B1 * eXB * E_eu
First difference gives difference in expected values of \(y_i\) for a discrete change in \(x_{1i}\)
Different options here
Move focal variable from a particular value to another particular value (e.g., 10 to 11)
Move focal variable by a fixed amount, allowing obs to keep own values of all vars
# exp(X_B) low yrs
eXB_low <- exp(predict(m1,
newdata = data.frame(yrs.since.phd = 10,
sex = "Male",
discipline = "B"
)
)
)
# exp(X_B) high yrs
eXB_high <- exp(predict(m1,
newdata = data.frame(yrs.since.phd = 11,
sex = "Male",
discipline = "B"
)
)
)
# calculate estimate of E(exp(u)) (Duan's smearing estimator)
E_eu <- sum(exp(m1$residuals)) / nrow(m1$model)
# calculate FD
(eXB_high - eXB_low) * E_eu
# exp(X_B) low (original values)
eXB_low <- exp(predict(m1))
# exp(X_B) high (+1 year from original values)
newdata <- m1$model
newdata$yrs.since.phd <- newdata$yrs.since.phd + 1
eXB_high <- exp(predict(m1, newdata = newdata))
# calculate estimate of E(exp(u)) (Duan's smearing estimator)
E_eu <- sum(exp(m1$residuals)) / nrow(m1$model)
# calculate average FD
mean((eXB_high - eXB_low) * E_eu)
\[y_i = \beta_0 + \beta_1 \text{ln}(x_{1i}) + \beta_2 \text{ln}(x_{2i}) +u_i\]
\(\frac{\Delta y_i}{\Delta x_{1i}} = \beta_1 \text{ln}(x_{1i}+\Delta x_{1i}) - \beta_1 \text{ln}(x_{1i}) = \beta_1 \text{ln}(\frac{x_{1i} + \Delta x_{1i}}{x_{1i}})\)
Call:
lm(formula = salary ~ log(yrs.since.phd) + sex + discipline,
data = carData::Salaries)
Residuals:
Min 1Q Median 3Q Max
-73925 -16473 -2228 14208 93982
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 39475 6171 6.397 4.50e-10 ***
log(yrs.since.phd) 20522 1629 12.598 < 2e-16 ***
sexMale 7476 4264 1.753 0.0804 .
disciplineB 15961 2581 6.183 1.58e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 25100 on 393 degrees of freedom
Multiple R-squared: 0.3186, Adjusted R-squared: 0.3134
F-statistic: 61.25 on 3 and 393 DF, p-value: < 2.2e-16
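Using the printed coefficient on log(yrs.since.phd), the implied dollar change for a proportional change in experience (a sketch with the rounded estimate):

```r
b1 <- 20522     # rounded coefficient on log(yrs.since.phd)
b1 * log(1.10)  # a 10% increase in yrs.since.phd: ~ +$1,956
b1 * log(2)     # doubling yrs.since.phd: ~ +$14,225
```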
The log-log model combines the previous two, and so all of the above applies!
\[\text{ln}(y_i) = \beta_0 + \beta_1 \text{ln}(x_{1i}) + \beta_2 \text{ln}(x_{2i}) + u_i\]
\(\text{E}(y_i) = \text{E}(e^{\beta_0 + \beta_1 \text{ln}(x_{1i}) + \beta_2 \text{ln}(x_{2i}) + u_i})\)
\(\frac{e^{\beta_0}e^{\beta_1 \left(\text{ln}(x_{1i} + \Delta x_{1i}) \right)} e^{\beta_2 \text{ln}(x_{2i})}\text{E}(e^{u_i})} {e^{\beta_0}e^{\beta_1 \left(\text{ln}(x_{1i}) \right)}e^{\beta_2 \text{ln}(x_{2i})}\text{E}(e^{u_i})} = e^{\beta_1 \text{ln} \left(\frac{x_{1i}+\Delta x_{1i}}{x_{1i}} \right)}\)
The factor change in \(y\) for a \(p\)% change in \(x\) is \(e^{\beta \text{ln}(\frac{100+p}{100})}\)
We can thus roughly interpret \(\beta\) as the % change in \(y\) for a 1% change in \(x\)
Call:
lm(formula = log(salary) ~ log(yrs.since.phd) + sex + discipline,
data = carData::Salaries)
Residuals:
Min 1Q Median 3Q Max
-0.76594 -0.13749 0.00164 0.14007 0.59995
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 10.93192 0.05149 212.292 < 2e-16 ***
log(yrs.since.phd) 0.18557 0.01359 13.651 < 2e-16 ***
sexMale 0.06913 0.03558 1.943 0.0528 .
disciplineB 0.14943 0.02154 6.937 1.66e-11 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.2094 on 393 degrees of freedom
Multiple R-squared: 0.3569, Adjusted R-squared: 0.352
F-statistic: 72.69 on 3 and 393 DF, p-value: < 2.2e-16
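Checking the elasticity interpretation with the printed estimate (a sketch with the rounded value): the factor change for a 1% increase in yrs.since.phd is close to \(1 + \beta/100\).

```r
b1 <- 0.18557                      # rounded coefficient on log(yrs.since.phd)
factor_chg <- exp(b1 * log(1.01))  # exact factor change for a 1% increase
(factor_chg - 1) * 100             # ~0.185%, close to b1 itself
```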
A non-linear relationship can also be specified via a polynomial function for IVs, e.g.:
\[y_i = \beta_0 + \beta_1x_{1i} + \beta_2x_{1i}^2 + \beta_3x_{2i} + u_i\]
The marginal effect of \(x_{1i}\) is the first partial derivative:
\[\frac{\partial y_i}{\partial x_{1i}} = \beta_1 + 2\beta_2x_{1i}\]
\[y_i = \beta_0 + \beta_1x_{1i} + \beta_2x_{1i}^2 + \beta_3x_{1i}^3 + \beta_4x_{2i} + u_i\]
The marginal effect of \(x_{1i}\) is the first partial derivative:
\[\frac{\partial y_i}{\partial x_{1i}} = \beta_1 + 2\beta_2x_{1i} + 3\beta_3x_{1i}^2\]
\[y_i = \beta_0 + \beta_1x_{1i} + \beta_2x_{1i}^2 + \beta_3x_{2i} + u_i\]
The first difference is:
\(\frac{\Delta y_i}{\Delta x_{1i}} = \left( \beta_1(x_{1i} + \Delta x_{1i}) + \beta_2(x_{1i} + \Delta x_{1i})^2 \right) - \left( \beta_1(x_{1i}) + \beta_2(x_{1i})^2 \right)\)
\(= \beta_1(\Delta x_{1i}) + \beta_2(\Delta x_{1i}^2 + 2x_{1i}\Delta x_{1i})\)
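A quick numeric check that the two expressions above agree (arbitrary illustrative values):

```r
b1 <- 0.5; b2 <- -0.2  # arbitrary coefficients
x1 <- 3;  d  <- 0.25   # starting value and change
fd_direct  <- (b1 * (x1 + d) + b2 * (x1 + d)^2) - (b1 * x1 + b2 * x1^2)
fd_formula <- b1 * d + b2 * (d^2 + 2 * x1 * d)
all.equal(fd_direct, fd_formula)  # TRUE
```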
# load Lucid pub opinion data
lucid <- read.csv("data/Lucid_Data.csv",
stringsAsFactors = F)
# estimate regression of left-right economic policy prefs on IVs
m4 <- lm(econ_mean ~ age_scale + I(age_scale^2) +
male + educ_scale + income_scale,
data = lucid)
summary(m4)
Call:
lm(formula = econ_mean ~ age_scale + I(age_scale^2) + male +
educ_scale + income_scale, data = lucid)
Residuals:
Min 1Q Median 3Q Max
-0.43693 -0.13101 -0.00162 0.11975 0.69088
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.322067 0.011080 29.068 < 2e-16 ***
age_scale -0.158514 0.049534 -3.200 0.00138 **
I(age_scale^2) 0.119425 0.056702 2.106 0.03524 *
male 0.033077 0.005582 5.926 3.33e-09 ***
educ_scale 0.021022 0.010864 1.935 0.05304 .
income_scale 0.120404 0.010602 11.356 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1882 on 4684 degrees of freedom
(58 observations deleted due to missingness)
Multiple R-squared: 0.05823, Adjusted R-squared: 0.05723
F-statistic: 57.93 on 5 and 4684 DF, p-value: < 2.2e-16
The marginal effect of age_scale is \(\beta_1 + 2\beta_2 \, \text{age\_scale}\)
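Evaluating \(\beta_1 + 2\beta_2 \, \text{age\_scale}\) with the printed m4 estimates (a sketch with rounded values):

```r
b1 <- -0.158514; b2 <- 0.119425  # m4 estimates for age_scale, age_scale^2
age <- c(0, 0.25, 0.5, 0.75, 1)
b1 + 2 * b2 * age  # ME rises with age and changes sign
-b1 / (2 * b2)     # turning point: ~0.664
```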
An interaction is a situation in which the relationship of each of two IVs to the DV depends on the level of the other
\[y_i = \beta_0 + \beta_1x_{1i} + \beta_2x_{2i} + \beta_3x_{1i}x_{2i} + u_i\]
The marginal effect of \(x_{1i}\) is the first partial derivative:
\[\frac{\partial y_i}{\partial x_{1i}} = \beta_1 + \beta_3x_{2i}\]
The first difference is:
\[\frac{\Delta y_i}{\Delta x_{1i}} = \beta_1\Delta x_{1i} + \beta_3\Delta x_{1i}x_{2i}\]
# estimate regression of left-right economic policy prefs on IVs
m5 <- lm(econ_mean ~ age_scale + I(age_scale^2) + male + educ_scale +
income_scale*know_mean,
data = lucid)
summary(m5)
Call:
lm(formula = econ_mean ~ age_scale + I(age_scale^2) + male +
educ_scale + income_scale * know_mean, data = lucid)
Residuals:
Min 1Q Median 3Q Max
-0.46091 -0.13112 -0.00257 0.11944 0.69693
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.3157633 0.0135866 23.241 < 2e-16 ***
age_scale -0.1561355 0.0495193 -3.153 0.00163 **
I(age_scale^2) 0.1232286 0.0566972 2.173 0.02980 *
male 0.0348407 0.0056419 6.175 7.16e-10 ***
educ_scale 0.0257784 0.0111371 2.315 0.02068 *
income_scale 0.1561915 0.0227687 6.860 7.79e-12 ***
know_mean 0.0007747 0.0165371 0.047 0.96264
income_scale:know_mean -0.0557548 0.0330965 -1.685 0.09213 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1881 on 4682 degrees of freedom
(58 observations deleted due to missingness)
Multiple R-squared: 0.05968, Adjusted R-squared: 0.05827
F-statistic: 42.45 on 7 and 4682 DF, p-value: < 2.2e-16
# standardized
m5_s <- lm(econ_mean ~ age_scale + I(age_scale^2) + male + educ_scale +
scale(income_scale)*scale(know_mean),
data = lucid)
summary(m5_s)
Call:
lm(formula = econ_mean ~ age_scale + I(age_scale^2) + male +
educ_scale + scale(income_scale) * scale(know_mean), data = lucid)
Residuals:
Min 1Q Median 3Q Max
-0.46091 -0.13112 -0.00257 0.11944 0.69693
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.367927 0.011513 31.958 < 2e-16 ***
age_scale -0.156136 0.049519 -3.153 0.00163 **
I(age_scale^2) 0.123229 0.056697 2.173 0.02980 *
male 0.034841 0.005642 6.175 7.16e-10 ***
educ_scale 0.025778 0.011137 2.315 0.02068 *
scale(income_scale) 0.036094 0.003114 11.591 < 2e-16 ***
scale(know_mean) -0.006602 0.002984 -2.212 0.02698 *
scale(income_scale):scale(know_mean) -0.004767 0.002829 -1.685 0.09213 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1881 on 4682 degrees of freedom
(58 observations deleted due to missingness)
Multiple R-squared: 0.05968, Adjusted R-squared: 0.05827
F-statistic: 42.45 on 7 and 4682 DF, p-value: < 2.2e-16
Ey <- predict(m5, newdata = data.frame(
# set non-focal variables at central tendencies
age_scale = mean(lucid$age_scale, na.rm = T),
male = median(lucid$male, na.rm = T),
educ_scale = mean(lucid$educ_scale, na.rm = T),
# vary two interacting variables across reasonable values of each
income_scale = rep(seq(0, 1, 0.05), 3),
know_mean = c(rep(0, 21), rep(0.5, 21), rep(1, 21))
)
)
The marginal effect of income_scale is \(\beta_6 + \beta_8 \, \text{know\_mean}\)
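Plugging the printed m5 estimates into this expression (a sketch with rounded values):

```r
b_inc <- 0.1561915; b_int <- -0.0557548  # m5: income_scale, income_scale:know_mean
know  <- c(0, 0.5, 1)
b_inc + b_int * know  # ~0.156, ~0.128, ~0.100: income matters less at high knowledge
```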
Alternative moderators of an interactive relationship need to be included as additional interaction terms, not just as separate controls
A simple interaction between two variables implies a linear function for the conditional marginal effect:
\[\frac{\partial y_i}{\partial x_{1i}} = \beta_1 + \beta_3x_{2i}\]
The interflex package provides diagnostics and more flexible estimators for when this linearity assumption is doubtful (see also https://doi.org/10.1093/poq/nfac004)
\[ \begin{aligned} y_i = \beta_0 &+ \beta_1x_{1i} + \beta_2x_{2i} + \beta_3x_{3i} \\ &+ \beta_4x_{1i}x_{2i} + \beta_5x_{1i}x_{3i} + \beta_6x_{2i}x_{3i}\\ &+ \beta_7x_{1i}x_{2i}x_{3i} \\ &+ u_i \end{aligned} \]
In practice, you need strong theory and a lot of data to warrant estimating a three-way interaction.
\[\frac{\partial y_i}{\partial x_{1i}} = \beta_1 + \beta_4x_{2i} + \beta_5x_{3i} + \beta_7x_{2i}x_{3i}\]
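A numeric check of this conditional marginal effect (illustrative, arbitrary betas; the formula is exact because \(y\) is linear in \(x_{1i}\) given \(x_{2i}\) and \(x_{3i}\)):

```r
b <- c(0.1, 0.4, -0.3, 0.2, 0.15, -0.1, 0.05, 0.25)  # b0..b7, arbitrary
y <- function(x1, x2, x3)
  b[1] + b[2]*x1 + b[3]*x2 + b[4]*x3 +
  b[5]*x1*x2 + b[6]*x1*x3 + b[7]*x2*x3 + b[8]*x1*x2*x3
x2 <- 0.7; x3 <- -1.2
me_formula <- b[2] + b[5]*x2 + b[6]*x3 + b[8]*x2*x3
me_slope   <- y(2, x2, x3) - y(1, x2, x3)  # unit change in x1
all.equal(me_formula, me_slope)  # TRUE
```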