Ordered models are discrete distributions used to analyze dependent variables with more than two categories, where those categories have a natural order or ranking. Unlike binary models, which only accommodate two possible outcomes, ordered models allow for multiple ordered outcomes and are often employed in situations where the dependent variable is ordinal.
To estimate an ordered model, we typically use a latent variable interpretation of the dependent variable. The idea behind this interpretation is that while we would ideally measure the dependent variable on a continuous scale and express it as a linear function of one or more independent variables, in practice, the variable is only observed in a limited number of categories. This latent variable approach provides a convenient framework for modeling such outcomes.
The unobserved continuous measure is denoted as \(Y^*\) (a latent or underlying variable). The observed dependent variable, \(Y\), represents categories that each cover a specific range of this latent variable. Formally, given \(k\) categories and labeling the cutpoints that separate these categories as \(\tau_k\), \(Y\) can be defined as:
\[ Y = \begin{cases} 1 & \text{if} \quad Y^* < \tau_1 \\ 2 & \text{if} \quad \tau_1 \leq Y^* < \tau_2 \\ 3 & \text{if} \quad \tau_2 \leq Y^* < \tau_3 \\ \vdots & \\ k & \text{if} \quad Y^* \geq \tau_{k-1} \end{cases} \]
In this example, we simulate an ordered probit model, which assumes that the unobserved latent variable \(Y^*\) follows a normal distribution.
The systematic component of the model is common across all categories, meaning that the effect of a change in an independent variable \(X\) on the probability of a respondent moving from one category to the next on the observed variable \(Y\) is the same across all categories. This feature is known as the parallel probits assumption or proportional odds assumption (more on this in the next lab).
To implement the latent variable framework, we generate random samples of \(Y^*\) and then assign these simulated values of \(Y^*\) to different categories of \(Y\) based on the position of \(Y^*\) relative to the cutpoints. Therefore, defining these cutpoints is a critical step.
The location of the latent variable \(Y^*\) can be set in two common ways. One approach is to fix one of the cutpoints \(\tau_k\) at zero and estimate the intercept term \(\beta_0\) along with the other cutpoints relative to this fixed point. This method is often the default in binary model estimation routines.
The second approach, which we use here, is to fix the intercept at zero and estimate all cutpoints relative to it. Importantly, this choice does not affect the relative estimates and coefficients, but restricting either the intercept or a cutpoint to zero is necessary for identification.
Since the estimator fixes the intercept at zero, we must define the cutpoints in the data-generating process (DGP) to check if the estimator accurately recovers them. This requires defining the cutpoints relative to the expected mean and variance of \(Y^*\), based on its assumed probability distribution.
Since \(Y^*\) is a latent variable that we do not directly observe or measure, it lacks an inherent scale. To estimate the model, we must establish a scale for \(Y^*\), which involves defining both its location and variance.
Probability Distribution Assumption: we must first assume a probability distribution for \(Y^*\). In ordered models, this distribution is typically either the normal distribution (for ordered probit models) or the logistic distribution (for ordered logit models).
Variance of \(Y^*\): - For a standard normal distribution, the variance is equal to 1. - For a standard logistic distribution, the variance is \(\frac{\pi^2}{3}\).
This variance specification applies to the stochastic portion of \(Y^*\). Without a loss of generality, it could be scaled by a constant to introduce more or less variance if desired. Importantly, in binary logit and probit models, we also make these same assumptions about the variance of the stochastic term.
Below is the code to simulate an ordered probit model. In this example, we will simulate an ordered categorical variable with four discrete levels \(Y = {1, 2, 3, 4}\). You can think of these levels as survey responses or any other common application involving discrete ordered distributions.
# Simulation of Oredered Probit Model
set.seed(8732)
b0 <- 0 # true value of the intercept
b1 <- 0.5 # true value for the slope
n <- 10000 # sample size
# create independent variable
X <- rnorm(n, mean=0, sd=1)
# Systematic component
XB <- b0 + b1 * X
# SD of error of unobserved Y*
sd.error <- 1
# Define the true cutpoints
tau1 <- qnorm(0.1,
mean = mean(XB),
sd = sqrt(var(XB)) + sd.error^2)
tau2 <- qnorm(0.3,
mean = mean(XB),
sd = sqrt(var(XB)) + sd.error^2)
tau3 <- qnorm(0.8,
mean = mean(XB),
sd = sqrt(var(XB)) + sd.error^2)
# set the latent unobserved variable
Y.star <- rnorm(n, XB, sd.error)
# Define Y as a vector of NAs with length n
Y <- rep(NA, n)
# Create observed categorical outcomes Y
# Set Y equal to a value according to Y.star
Y <- dplyr::case_when(
Y.star < tau1 ~ 1,
Y.star >= tau1 & Y.star < tau2 ~ 2,
Y.star >= tau2 & Y.star < tau3 ~ 3,
Y.star >= tau3 ~ 4
)
# Save the data
dat <- tibble(Y.star = Y.star,
Y = Y,
X = X)
To understand the intuition behind the latent variable \(Y^*\), the observed categories \(Y\), and how the estimation maps the cutoff thresholds into the latent continuous space, we present some visualizations of the simulated data below.
Now, we can estimate an ordinal probit model using the polr function from the MASS package. This allows us to compare the estimated cutoff points and the marginal effect of \(X\) against the true parameters used in the simulation. While the estimated values may not perfectly match the true parameters (as is expected due to random variation in the sample), the estimates should be close, demonstrating the effectiveness of the model in recovering the underlying structure of the data.
# Estimate ordered model
model <- MASS::polr(as.ordered(Y) ~ X,
method = "probit",
Hess = TRUE,
data = )
summary(model)
## Call:
## MASS::polr(formula = as.ordered(Y) ~ X, Hess = TRUE, method = "probit")
##
## Coefficients:
## Value Std. Error t value
## X 0.5036 0.01229 40.97
##
## Intercepts:
## Value Std. Error t value
## 1|2 -1.8885 0.0242 -78.1950
## 2|3 -0.7741 0.0146 -52.9629
## 3|4 1.2570 0.0173 72.5066
##
## Residual Deviance: 18635.07
## AIC: 18643.07
# Compare estimates vs true
tibble(
parameter = c("tau1","tau2","tau3","X"),
estimate = c(round(as.numeric(model$zeta),4),b1),
true = c(round(tau1,4),round(tau2,4),round(tau3,4),
round(as.numeric(model$coefficients),4))
)
## # A tibble: 4 × 3
## parameter estimate true
## <chr> <dbl> <dbl>
## 1 tau1 -1.89 -1.91
## 2 tau2 -0.774 -0.775
## 3 tau3 1.26 1.28
## 4 X 0.5 0.504
Below, we revisit the example discussed in class on sexist attitudes toward working mothers. The dependent variable is an ordered categorical variable with four levels, ranging from “Strongly Disagree” to “Strongly Agree.”
The wording of the survey item is as follows:
A working mother can establish just as warm and secure of a relationship with her child as a mother who does not work.
We will filter the sample into one sample from the 1977 wave, and the other on the 1989 wave. The main variables of interest are:
male
: whether the respondent was male (1) or female
(0).white
: whether the respondent was white (1) or some
other race (0).age
: respondent’s age in years.education
: respondent’s years of education.prestige
: completed prestige % of other survey
respondents rating this R’s job as prestigious.## Load data
workmom <- read_csv("https://faculty.washington.edu/cadolph/mle/ordwarm2.csv")
names(workmom)
## [1] "warm" "yr89" "male" "white" "age" "ed" "prst"
## [8] "warmlt2" "warmlt3" "warmlt4"
workmom77 <- workmom |> filter(yr89==0)
workmom89 <- workmom |> filter(yr89==1)
# Data from 1977, 1989 GSS: Attitudes towards working mothers
## Mother can have warm feelings towards child?
y <- workmom77$warm
## selecting covariates
x <- workmom77 |>
select(male,white,age,ed,prst) |>
as.matrix()
## male respondent; white resp; age of resp;
## years of education of respondent;
## prestige of respondent's occupation (% considering prestigious)
optim
and polr
We will first estimate an ordinal probit using a likelihood function
and optim()
. After, we will re-estimate the model but this
time using polr()
from MASS::
library.
Probabilities we want to estimate in four category case:
\[ \begin{aligned} \text{Pr}(y_i=1|\boldsymbol{x_i})&=\Phi(\tau_1-\alpha-\boldsymbol{x_i\beta})\\ \text{Pr}(y_i=2|\boldsymbol{x_i})&=\Phi(\tau_2-\alpha-\boldsymbol{x_i\beta})-\Phi(\tau_{1}-\alpha-\boldsymbol{x_i\beta})\\ \text{Pr}(y_i=3|\boldsymbol{x_i})&=\Phi(\tau_3-\alpha-\boldsymbol{x_i\beta})-\Phi(\tau_{2}-\alpha-\boldsymbol{x_i\beta})\\ \text{Pr}(y_i=4|\boldsymbol{x_i})&=1-\Phi(\tau_3-\alpha-\boldsymbol{x_i\beta}) \end{aligned} \]
To identify the model, we commonly make one of two assumptions:
Assume that \(\tau_{1}=0\). This
is also the identifying assumption of logit and probit.
optim()
uses this.
Assume that \(\alpha=0\).
polr()
uses this.
The likelihood function for ordered probit finds the \(\boldsymbol{\beta}\) and \(\tau\) that make the observed data most likely.
# Model specification (for simcf)
model <- warm ~ male + white + age + ed + prst
## Likelihood for 4 category ordered probit
llk.oprobit4 <- function(param, x, y) {
# preliminaries
os <- rep(1, nrow(x))
x <- cbind(os, x)
b <- param[1:ncol(x)]
t2 <- param[(ncol(x)+1)]
t3 <- param[(ncol(x)+2)]
t4 <- param[(ncol(x)+3)]
# probabilities and penalty function
xb <- x%*%b
p1 <- log(pnorm(-xb))
# penalty function to keep t2>0
if (t2<=0) p2 <- -(abs(t2)*10000)
else p2 <- log(pnorm(t2-xb)-pnorm(-xb))
# penalty to keep t3>t2
if (t3<=t2) p3 <- -((t2-t3)*10000)
else p3 <- log(pnorm(t3-xb)-pnorm(t2-xb))
# penalty to keep t3>t2
if (t3<=t2) p3 <- -((t2-t3)*10000)
else p3 <- log(pnorm(t3-xb)-pnorm(t2-xb))
p4 <- log(1-pnorm(t3-xb))
# -1 * log likelihood (optim is a minimizer)
-sum(cbind(y==1,y==2,y==3,y==4) * cbind(p1,p2,p3,p4))
}
# MLE via optim
## use ls estimates as starting values
ls.result <- lm(model, data=workmom77)
# initial guesses
stval <- c(coef(ls.result),1,2)
oprobit.res77 <- optim(stval,
llk.oprobit4,
method="BFGS",
x=x,
y=y,
hessian=TRUE)
(pe77 <- oprobit.res77$par) # point estimates
## (Intercept) male white age ed prst
## 1.321689587 -0.397228916 -0.238335818 -0.011703265 0.044989779 0.002116029
##
## 1.016421525 2.077970893
(vc77 <- solve(oprobit.res77$hessian)) # var-cov matrix
## (Intercept) male white age
## (Intercept) 3.205974e-02 -1.832887e-03 -5.475643e-03 -2.262774e-04
## male -1.832887e-03 3.414003e-03 -3.709616e-05 6.780298e-06
## white -5.475643e-03 -3.709616e-05 8.296760e-03 -8.602415e-06
## age -2.262774e-04 6.780298e-06 -8.602415e-06 3.670450e-06
## ed -1.348305e-03 1.698673e-05 -6.289883e-05 8.993977e-06
## prst 3.980131e-05 -1.157565e-05 -2.126967e-05 -1.186523e-06
## 1.163657e-03 -1.141861e-04 -1.208110e-04 -5.034573e-06
## 1.745683e-03 -3.829872e-04 -2.107986e-04 -1.175574e-05
## ed prst
## (Intercept) -1.348305e-03 3.980131e-05 1.163657e-03 1.745683e-03
## male 1.698673e-05 -1.157565e-05 -1.141861e-04 -3.829872e-04
## white -6.289883e-05 -2.126967e-05 -1.208110e-04 -2.107986e-04
## age 8.993977e-06 -1.186523e-06 -5.034573e-06 -1.175574e-05
## ed 1.458304e-04 -1.844602e-05 2.900561e-05 4.117394e-05
## prst -1.844602e-05 6.691980e-06 -6.657005e-07 2.512622e-06
## 2.900561e-05 -6.657005e-07 1.688779e-03 1.414225e-03
## 4.117394e-05 2.512622e-06 1.414225e-03 2.962733e-03
(se77 <- sqrt(diag(vc77))) # standard errors
## (Intercept) male white age ed prst
## 0.179052337 0.058429471 0.091086552 0.001915842 0.012076026 0.002586886
##
## 0.041094751 0.054430991
(ll77 <- -oprobit.res77$value) # likelihood at maximum
## [1] -1758.824
# Use MASS::polr to do ordered probit (need to turn DV into categorical)
workmom77 <-
workmom77 |>
mutate(
warm_f = factor(warm,
labels = c("Strongly Disagree",
"Disagree",
"Agree",
"Strongly Agree"))
) |> select(warm_f,everything())
glm.res77 <- MASS::polr(warm_f ~ male + white + age + ed + prst,
data=workmom77,
method="probit",
Hess = TRUE,
na.action=na.omit)
summary(glm.res77) # polr output
## Call:
## MASS::polr(formula = warm_f ~ male + white + age + ed + prst,
## data = workmom77, na.action = na.omit, Hess = TRUE, method = "probit")
##
## Coefficients:
## Value Std. Error t value
## male -0.397225 0.058429 -6.7984
## white -0.238340 0.091087 -2.6166
## age -0.011701 0.001916 -6.1077
## ed 0.044996 0.012076 3.7260
## prst 0.002115 0.002587 0.8175
##
## Intercepts:
## Value Std. Error t value
## Strongly Disagree|Disagree -1.3216 0.1791 -7.3809
## Disagree|Agree -0.3052 0.1773 -1.7216
## Agree|Strongly Agree 0.7564 0.1776 4.2597
##
## Residual Deviance: 3517.649
## AIC: 3533.649
pe77 # optim ouput
## (Intercept) male white age ed prst
## 1.321689587 -0.397228916 -0.238335818 -0.011703265 0.044989779 0.002116029
##
## 1.016421525 2.077970893
# we can retrive the same threhsolds by subtracting the intercepts
0 - pe77[1] # tau 1
## (Intercept)
## -1.32169
pe77[7] - pe77[1]# tau 2
##
## -0.3052681
pe77[8] - pe77[1]# tau 3
##
## 0.7562813
Notice that the estimation of the coefficients using either
optim
or polr
is practically the same.
In the optim
routine, where we have estimated the
intercept and applied the constraint of 0 in \(\tau_1\), we can get the same thresholds
and cutoff points as in polr
using the above formula of
\(\tau_1-\alpha\).
Now that we have estimated an ordinal probit model, the next step, as
usual, is to predict quantities of interest using
simcf()
.
In some cases, it may be more useful to report the marginal effect of a discrete change. This is straightforward for binary discrete variables: we can simply compute the expected value for a specific dichotomous state, such as \(\mathbb{E}[Y \mid \text{male} = 0]\) or \(\mathbb{E}[Y \mid \text{male} = 1]\). Alternatively, we can calculate the first difference: \(\mathbb{E}[Y \mid \text{male} = 1] - \mathbb{E}[Y \mid \text{male} = 0]\).
However, discrete changes for continuous variables, such as
age
in this example, are less straightforward. There are
various strategies to compute and present meaningful discrete changes
for continuous variables that can be reported.
Three common strategies are:
Theoretically meaningful quantities: For example, we might hypothesize meaningful cohort effects between respondents in their 20s or younger compared to those in their 50s or older. While this approach may seem arbitrary, it is motivated by theoretical considerations and requires assumptions based on domain knowledge.
Standard deviation changes: This involves comparing the mean minus one standard deviation with the mean plus one standard deviation. However, caution is necessary if the distribution is highly skewed; in such cases, it may be better to use the third strategy.
Changes in quantiles: Select quantiles that offer a clear contrast, such as the interquartile range (comparing the 3rd quartile with the 1st quartile) or comparing the 80th percentile to the 20th percentile.
In general, I have found that the third strategy often provides the best representation of discrete changes, though it is important to acknowledge that each approach involves some degree of arbitrariness.
## Q: how many point estimates does polr provide? Think about how to fit them in simcf
# Simulate parameters from predictive distributions
sims <- 10000
simbetas <- mvrnorm(sims, pe77, vc77) # draw parameters, using MASS::mvrnorm
# Create example counterfactuals
xhyp <- cfMake(model, workmom77, nscen=10)
xhyp <- cfName(xhyp,"Male", scen=1)
xhyp <- cfChange(xhyp, "male", x=1, xpre=0, scen=1)
xhyp <- cfName(xhyp, "Female", scen=2)
xhyp <- cfChange(xhyp, "male", x=0, xpre=1, scen=2)
xhyp <- cfName(xhyp, "Nonwhite", scen=3)
xhyp <- cfChange(xhyp, "white", x=0, xpre=1, scen=3)
xhyp <- cfName(xhyp, "White", scen=4)
xhyp <- cfChange(xhyp, "white", x=1, xpre=0, scen=4)
xhyp <- cfName(xhyp, "Age + 1sd = 61", scen=5)
xhyp <- cfChange(xhyp, "age",
x=mean(na.omit(workmom77$age))+sd(na.omit(workmom77$age)),
xpre=mean(na.omit(workmom77$age)),
scen=5)
xhyp <- cfName(xhyp, "Age - 1sd = 28", scen=6)
xhyp <- cfChange(xhyp, "age",
x=mean(na.omit(workmom77$age))-sd(na.omit(workmom77$age)),
xpre=mean(na.omit(workmom77$age)),
scen=6)
xhyp <- cfName(xhyp, "High School Grad", scen=7)
xhyp <- cfChange(xhyp, "ed", x=12, xpre=mean(na.omit(workmom77$ed)), scen=7)
xhyp <- cfName(xhyp,"College Grad", scen=8)
xhyp <- cfChange(xhyp, "ed", x=16, xpre=mean(na.omit(workmom77$ed)), scen=8)
xhyp <- cfName(xhyp,"High Prestige Job (+1 sd)", scen=9)
xhyp <- cfChange(xhyp, "prst",
x=mean(na.omit(workmom77$prst))+sd(na.omit(workmom77$prst)),
xpre=mean(na.omit(workmom77$prst)),
scen=9)
xhyp <- cfName(xhyp,"Low Prestige Job (-1 sd)", scen=10)
xhyp <- cfChange(xhyp, "prst",
x=mean(na.omit(workmom77$prst))-sd(na.omit(workmom77$prst)),
xpre=mean(na.omit(workmom77$prst)),
scen=10)
# Simulate expected probabilities (all four categories)
oprobit.ev77 <- oprobitsimev(xhyp, simbetas, cat=4)
# Simulate first differences (all four categories)
oprobit.fd77 <- oprobitsimfd(xhyp, simbetas, cat=4)
# Simulate relative risks (all four categories)
oprobit.rr77 <- oprobitsimrr(xhyp, simbetas, cat=4)
ggplot
# Plot predicted probabilities for all four categories, sorted by size
## ggplot
CatNames <-
c("Strongly\n Disagree",
"Disagree",
"Agree",
"Strongly\n Agree")
scenNames <- row.names(xhyp$x)
oprobit.ev77.tidy <-
oprobit.ev77 |>
# Set names for columns
map(set_colnames, CatNames) |>
# Set names for rows
map(set_rownames, scenNames) |>
# collapse list into data frame
map_df(as_tibble, rownames = "scenNames", .id = "Type") |>
# Wrangle to long format
pivot_longer(CatNames[1]:CatNames[4], names_to = "Category") |>
# Wrangle to ggplot friendly
pivot_wider(names_from = Type, values_from = value) |>
mutate(Category = as_factor(Category)) |>
# Grouping by category first
group_by(Category) |>
# fct_reorder to sort scenNames by the pe of Strongly Dis
mutate(scenNames = fct_reorder(scenNames, pe, .desc = TRUE))
oprobit.ev77.tidy |>
ggplot() +
aes(y = scenNames,
x = pe,
xmax = upper,
xmin = lower) +
ggstance::geom_pointrangeh(size = 0.3) +
scale_x_continuous(limits = c(0, 0.5),
breaks = seq(0.1, 0.4, by =0.1),
# dup_axis() creates symmetric axes
sec.axis = dup_axis(name = NULL)) +
facet_grid(~ Category)+
labs(x = "Predicted probability",
y = NULL) +
# theme from Brian
theme_caviz_vgrid +
theme(aspect.ratio = 2,
axis.text.x = element_text(color = "black", size = 10),
axis.ticks.x = element_blank())
tile
## tile
sorted <- order(oprobit.ev77$pe[,1])
scenNames <- row.names(xhyp$x)
trace1 <- ropeladder(x = oprobit.ev77$pe[sorted,1],
lower = oprobit.ev77$lower[sorted,1],
upper = oprobit.ev77$upper[sorted,1],
labels = scenNames[sorted],
size=0.5,
lex=1.5,
plot=1
)
trace2 <- ropeladder(x = oprobit.ev77$pe[sorted,2],
lower = oprobit.ev77$lower[sorted,2],
upper = oprobit.ev77$upper[sorted,2],
size=0.5,
lex=1.5,
plot=2
)
trace3 <- ropeladder(x = oprobit.ev77$pe[sorted,3],
lower = oprobit.ev77$lower[sorted,3],
upper = oprobit.ev77$upper[sorted,3],
size=0.5,
lex=1.5,
plot=3
)
trace4 <- ropeladder(x = oprobit.ev77$pe[sorted,4],
lower = oprobit.ev77$lower[sorted,4],
upper = oprobit.ev77$upper[sorted,4],
size=0.5,
lex=1.5,
plot=4
)
tile(trace1, trace2, trace3, trace4,
limits = c(0,0.5),
gridlines = list(type="xt"),
topaxis=list(add=TRUE, at=c(0,0.1,0.2,0.3,0.4,0.5)),
xaxistitle=list(labels="probability"),
topaxistitle=list(labels="probability"),
plottitle=list(labels=c("Strongly Disagree", "Disagree",
"Agree", "Strongly Agree")),
width=list(spacer=3),
height = list(plottitle=3,xaxistitle=3.5,topaxistitle=3.5)
)
Above, we computed discrete marginal associations for the probabilities of agreeing with the statement from the survey item. However, we can improve the communication of these results by collapsing the four levels into two broader categories: combining “Strongly Disagree” with “Disagree” and “Agree” with “Strongly Agree.”
To collapse ordered categories, provide a list of concatenation in
the recode=
argument from simcf::oprobitsimev
.
The code example below demonstrates this approach.
## Re-simulate, now collapsing presentation to two categories ("SD/D" vs "SA/A")
## Simulate expected probabilities (collapsing categories)
oprobit.ev77c <- oprobitsimev(xhyp, simbetas, cat=4,
recode=list(c(1,2), c(3,4)) )
## Simulate first differences (collapsing categories)
oprobit.fd77c <- oprobitsimfd(xhyp, simbetas, cat=4,
recode=list(c(1,2), c(3,4)) )
## Simulate relative risks (collapsing categories)
oprobit.rr77c <- oprobitsimrr(xhyp, simbetas, cat=4,
recode=list(c(1,2), c(3,4)) )
ggplot
## ggplot
### ropeladder: plot, EV of Dd vs aA
CatNames.c <-
c("Disagree or Str Disagree", "Agree or Str Agree")
oprobit.ev77c.tidy <-
oprobit.ev77c |>
map(set_colnames, CatNames.c) |> #Set names for columns
map(set_rownames, scenNames) |> #Set names for rows
map_df(as_tibble, rownames = "scenNames", .id = "Type") |> #collapse list into data frame
pivot_longer(CatNames.c[1]:CatNames.c[2], names_to = "Category") |> #Wrangle to long format
pivot_wider(names_from = Type, values_from = value) |> #Wrangle to ggplot friendly
mutate(Category = as_factor(Category)) |>
group_by(Category) |> # Grouping by category first
mutate(scenNames = fct_reorder(scenNames, pe, .desc = TRUE)) # helps fct_reorder to sort scenNames by the pe of Strongly Dis
oprobit.ev77c.tidy |>
ggplot() +
aes(y = scenNames,
x = pe,
xmax = upper,
xmin = lower) +
geom_pointrangeh(size = 0.3) +
scale_x_continuous(sec.axis = dup_axis(name = NULL)) + # dup_axis() creates symmetric axes
facet_grid(~ Category)+
labs(x = "Predicted probability",
y = NULL,
title = "Working Mothers Can Be Warm to Kids") +
theme_caviz_vgrid + # theme from Brian
theme(aspect.ratio = 2,
axis.text.x = element_text(color = "black", size = 10),
axis.ticks.x = element_blank())
### ropeladder: Plot to show only "SA/A"
oprobit.ev77c.tidy |>
filter(Category == CatNames.c[2]) |>
ggplot() +
aes(y = scenNames,
x = pe,
xmax = upper,
xmin = lower) +
geom_pointrangeh(size = 0.3) +
scale_x_continuous(sec.axis = dup_axis(name = NULL)) + # dup_axis() creates symmetric axes
labs(x = "Predicted probability",
y = NULL,
title = "Working Mothers Can Be Warm to Kids") +
theme_caviz_vgrid + # theme from Brian
theme(axis.text.x = element_text(color = "black", size = 10),
axis.ticks.x = element_blank())
tile
## tile
## Make a new rl plot, EV of Dd vs aA
trace1b <- ropeladder(x = oprobit.ev77c$pe[sorted,1],
lower = oprobit.ev77c$lower[sorted,1],
upper = oprobit.ev77c$upper[sorted,1],
labels = scenNames[sorted],
size=0.65,
lex=1.75,
plot=1
)
trace2b <- ropeladder(x = oprobit.ev77c$pe[sorted,2],
lower = oprobit.ev77c$lower[sorted,2],
upper = oprobit.ev77c$upper[sorted,2],
size=0.65,
lex=1.75,
plot=2
)
tile(trace1b, trace2b,
limits = c(0.35,0.65),
gridlines = list(type="xt"),
xaxis=list(at=c(0.4, 0.5, 0.6)),
topaxis=list(add=TRUE, at=c(0.4, 0.5, 0.6)),
xaxistitle=list(labels="probability"),
topaxistitle=list(labels="probability"),
plottitle=list(labels=c("Disagree or Str Disagree",
"Agree or Str Agree")),
width=list(spacer=3),
height = list(plottitle=3,xaxistitle=3.5,topaxistitle=3.5)
)
## Revise traces and plot to show only "SA/A"
trace2b$entryheight <- 0.2
trace2b$plot <- 1
trace2b$labels <- trace1b$labels
tile(trace2b,
limits = c(0.35,0.65),
gridlines = list(type="xt"),
xaxis=list(at=seq(0.35, 0.65, 0.05)),
topaxis=list(add=TRUE, at=seq(0.35, 0.65, 0.05)),
xaxistitle=list(labels="probability agree or strongly agree, 1977"),
topaxistitle=list(labels="probability agree or strongly agree, 1977"),
plottitle=list(labels="\"Working Mothers Can Be Warm to Kids\""),
width=list(plot=2.5),
height = list(plottitle=3,xaxistitle=3.5,topaxistitle=3.5)
)
Now we can re-do the analysis for the sample from 1989.
## Now estimate 1989 model
y <- workmom89$warm # Mother can have warm feelings towards child?
x <- workmom89 |> select(male,white,age,ed,prst) |> as.matrix()
# Use optim directly to get MLE
ls.result <- lm(model, data=workmom89) # use ls estimates as starting values
stval <- c(coef(ls.result),1,2) # initial guesses
oprobit.res89 <- optim(stval,
llk.oprobit4,
method="BFGS",
x=x,
y=y,
hessian=TRUE)
(pe89 <- oprobit.res89$par) # point estimates
## (Intercept) male white age ed prst
## 1.983971961 -0.442980617 -0.206295345 -0.012979963 0.027631612 0.005149637
##
## 1.198280077 2.414641320
(vc89 <- solve(oprobit.res89$hessian)) # var-cov matrix
## (Intercept) male white age
## (Intercept) 5.243718e-02 -2.275698e-03 -7.298131e-03 -3.041666e-04
## male -2.275698e-03 5.348504e-03 -4.601340e-04 1.458896e-05
## white -7.298131e-03 -4.601340e-04 1.160058e-02 -1.972153e-05
## age -3.041666e-04 1.458896e-05 -1.972153e-05 4.881033e-06
## ed -2.090501e-03 -5.626598e-05 -1.219894e-04 8.923750e-06
## prst 1.027141e-05 -2.043623e-06 -6.284932e-06 -8.863255e-07
## 3.805373e-03 -3.094736e-04 -5.391857e-05 -1.414865e-05
## 4.875512e-03 -6.956017e-04 -2.618378e-04 -2.451509e-05
## ed prst
## (Intercept) -2.090501e-03 1.027141e-05 3.805373e-03 4.875512e-03
## male -5.626598e-05 -2.043623e-06 -3.094736e-04 -6.956017e-04
## white -1.219894e-04 -6.284932e-06 -5.391857e-05 -2.618378e-04
## age 8.923750e-06 -8.863255e-07 -1.414865e-05 -2.451509e-05
## ed 2.211948e-04 -2.399284e-05 2.138751e-05 3.999702e-05
## prst -2.399284e-05 8.465097e-06 6.235541e-06 1.032785e-05
## 2.138751e-05 6.235541e-06 4.430397e-03 3.931261e-03
## 3.999702e-05 1.032785e-05 3.931261e-03 6.139354e-03
(se89 <- sqrt(diag(vc89))) # standard errors
## (Intercept) male white age ed prst
## 0.228991670 0.073133468 0.107705986 0.002209306 0.014872619 0.002909484
##
## 0.066561229 0.078354032
(ll89 <- -oprobit.res89$value) # likelihood at maximum
## [1] -1082.657
# re-estimating the model with the sample from 1977
glm.res77 <- MASS::polr(warm_f ~ male + white + age + ed + prst,
data=workmom77,
method="probit",
na.action=na.omit)
(pe77 <- c(coef(glm.res77),glm.res77$zeta))
## male white
## -0.397225180 -0.238339515
## age ed
## -0.011701218 0.044995843
## prst Strongly Disagree|Disagree
## 0.002114664 -1.321570640
## Disagree|Agree Agree|Strongly Agree
## -0.305164619 0.756389946
(vc77 <- vcov(glm.res77))
## male white age
## male 3.413998e-03 -3.709926e-05 6.780050e-06
## white -3.709926e-05 8.296751e-03 -8.602815e-06
## age 6.780050e-06 -8.602815e-06 3.670353e-06
## ed 1.698540e-05 -6.290118e-05 8.993936e-06
## prst -1.157493e-05 -2.126878e-05 -1.186494e-06
## Strongly Disagree|Disagree 1.832883e-03 5.475612e-03 2.262743e-04
## Disagree|Agree 1.718697e-03 5.354805e-03 2.212394e-04
## Agree|Strongly Agree 1.449894e-03 5.264818e-03 2.145182e-04
## ed prst
## male 1.698540e-05 -1.157493e-05
## white -6.290118e-05 -2.126878e-05
## age 8.993936e-06 -1.186494e-06
## ed 1.458303e-04 -1.844571e-05
## prst -1.844571e-05 6.691778e-06
## Strongly Disagree|Disagree 1.348311e-03 -3.980224e-05
## Disagree|Agree 1.377316e-03 -4.046800e-05
## Agree|Strongly Agree 1.389483e-03 -3.728944e-05
## Strongly Disagree|Disagree Disagree|Agree
## male 1.832883e-03 0.0017186966
## white 5.475612e-03 0.0053548051
## age 2.262743e-04 0.0002212394
## ed 1.348311e-03 0.0013773156
## prst -3.980224e-05 -0.0000404680
## Strongly Disagree|Disagree 3.205968e-02 0.0308960118
## Disagree|Agree 3.089601e-02 0.0314211044
## Agree|Strongly Agree 3.031397e-02 0.0305645167
## Agree|Strongly Agree
## male 1.449894e-03
## white 5.264818e-03
## age 2.145182e-04
## ed 1.389483e-03
## prst -3.728944e-05
## Strongly Disagree|Disagree 3.031397e-02
## Disagree|Agree 3.056452e-02
## Agree|Strongly Agree 3.153099e-02
sims <- 10000
# draw parameters, using MASS::mvrnorm
simbetas89 <- mvrnorm(sims, pe89, vc89)
# draw parameters, using MASS::mvrnorm
simbetas77 <- mvrnorm(sims, pe77, vc77)
# Create example counterfactuals -- for diffs
xhyp <- cfMake(model, workmom89, nscen=5)
xhyp <- cfName(xhyp, "Female (Male)", scen=1)
xhyp <- cfChange(xhyp, "male", x=0, xpre=1, scen=1)
xhyp <- cfName(xhyp, "Nonwhite (White)", scen=2)
xhyp <- cfChange(xhyp, "white", x=0, xpre=1, scen=2)
xhyp <- cfName(xhyp, "28 Year Olds (61)", scen=3)
xhyp <- cfChange(xhyp, "age",
x=mean(na.omit(workmom89$age))-sd(na.omit(workmom89$age)),
xpre=mean(na.omit(workmom89$age)),
scen=3)
xhyp <- cfName(xhyp,"College Grad (High School)", scen=4)
xhyp <- cfChange(xhyp, "ed", x=16, xpre=12, scen=4)
xhyp <- cfName(xhyp,"High Prestige Job (Low)", scen=5)
xhyp <- cfChange(xhyp, "prst",
x=mean(na.omit(workmom89$prst))+sd(na.omit(workmom89$prst)),
xpre=mean(na.omit(workmom89$prst)) - sd(na.omit(workmom89$prst)),
scen=5)
# Simulate expected probabilities (all four categories)
oprobit.ev89 <- oprobitsimev(xhyp, simbetas89, cat=4)
# Simulate first differences (all four categories)
oprobit.fd89 <- oprobitsimfd(xhyp, simbetas89, cat=4)
# Simulate relative risks (all four categories)
oprobit.rr89 <- oprobitsimrr(xhyp, simbetas89, cat=4)
# Re-simulate, now collapsing presentation to two categories ("SD/D" vs "SA/A")
# Simulate first differences, 1989
oprobit.fd89c <- oprobitsimfd(xhyp, simbetas89, cat=4,
recode=list(c(1,2), c(3,4)) )
# Simulate relative risks, 1989
oprobit.rr89c <- oprobitsimrr(xhyp, simbetas89, cat=4,
recode=list(c(1,2), c(3,4)) )
# Simulate first differences, 1977
oprobit.fd77c <- oprobitsimfd(xhyp, simbetas77, cat=4,
constant=NA, # for polr
recode=list(c(1,2), c(3,4)) )
# Simulate relative risks, 1977
oprobit.rr77c <- oprobitsimrr(xhyp, simbetas77, cat=4,
constant=NA, # for polr
recode=list(c(1,2), c(3,4)) )
# Make a new ropeladder plot, showing just change in probability of any agreement
## ggplot
scenNames <- row.names(xhyp$x)
CatNames.c <-
c("Disagree or Str Disagree", "Agree or Str Agree")
oprobit.fd89c.tidy <-
bind_rows(oprobit.fd77c, oprobit.fd89c) |>
map(set_colnames, CatNames.c) |>
#Set names for rows
map(set_rownames, rep(scenNames, 2)) |>
#collapse list into data frame
map_df(as_tibble, rownames = "scenNames", .id = "Type") |>
add_column(Year = rep(c(1979, 1989), each =5) |> rep(3)) |>
#Wrangle to long format
pivot_longer(CatNames.c[1]:CatNames.c[2], names_to = "Category") |>
#Wrangle to ggplot friendly
pivot_wider(names_from = Type, values_from = value) |>
mutate(Category = as_factor(Category)) |>
# Grouping by category first
group_by(Year, Category) |>
# fct_reorder to sort scenNames by the pe of Strongly Dis
mutate(scenNames = fct_reorder(scenNames, pe, .desc = TRUE),
# Add 95% significance
conf95 = case_when(upper>0 & lower>0 ~ TRUE,
upper<0 & lower<0 ~ TRUE,
TRUE ~ FALSE))
oprobit.fd89c.tidy |>
filter(Category == CatNames.c[2]) |>
ggplot() +
aes(y = scenNames,
x = pe,
xmax = upper,
xmin = lower,
color = factor(Year),
shape = conf95,
label = factor(Year)) +
# points with intervals
geom_pointrangeh(size = 0.6, fill = "white",
#Plot lines parallelly
position = position_dodge2v(height = 0.6,
reverse = T)) +
# include labels
geom_text(size = 4, nudge_y = c(rep(0.3, 5), rep(-0.3, 5)))+
# dup_axis() creates symmetric axes
scale_x_continuous(sec.axis = dup_axis(name = NULL)) +
# For line color
scale_color_manual(values = c(orange, blue), guide = FALSE)+
# For shape color
scale_shape_manual(values = c(21, 19), guide = FALSE)+
labs(x = "Predicted probability",
y = NULL,
title = "Working Mothers Can Be Warm to Kids") +
# theme from Brian
theme_caviz_vgrid +
theme(axis.text.x = element_text(color = "black", size = 10),
axis.ticks.x = element_blank())
## tile
sortedc <- rev(order(oprobit.fd89c$pe[,2]))
scenNames <- row.names(xhyp$x)
trace1c <- ropeladder(x = oprobit.fd77c$pe[sortedc,2],
lower = oprobit.fd77c$lower[sortedc,2],
upper = oprobit.fd77c$upper[sortedc,2],
labels = scenNames[sortedc],
sublabels="1979",
sublabelsyoffset=0.04,
col=orange,
size=0.65,
lex=1.75,
plot=1
)
trace2c <- ropeladder(x = oprobit.fd89c$pe[sortedc,2],
lower = oprobit.fd89c$lower[sortedc,2],
upper = oprobit.fd89c$upper[sortedc,2],
labels = scenNames[sortedc],
sublabels = "1989",
sublabelsyoffset = -0.04,
col=blue,
size=0.65,
lex=1.75,
entryheight=0.40,
subentryheight=.8,
plot=1
)
sigMark1 <- oprobit.fd77c$pe[sortedc,2]
is.na(sigMark1) <- (oprobit.fd89c$lower[sortedc,2]>0)
traceSig1 <- ropeladder(x=sigMark1,
col="white",
group=1,
plot=1)
sigMark2 <- oprobit.fd89c$pe[sortedc,2]
is.na(sigMark2) <- (oprobit.fd89c$lower[sortedc,2]>0)
traceSig2 <- ropeladder(x=sigMark2,
col="white",
group=2,
plot=1)
vertmark <- linesTile(x=c(0,0), y=c(0,1), plot=1)
tile(trace1c, trace2c, vertmark, traceSig1, traceSig2,
limits=c(-0.05,0.25),
gridlines=list(type="xt"),
topaxis=list(add=TRUE, at=seq(from=0, to=0.2, by=0.05),
labels=c("0%", "+5%", "+10%", "+15%", "+20%")),
xaxis=list(at=seq(from=0, to=0.2, by=0.05), labels=c("0%", "+5%", "+10%", "+15%", "+20%")),
xaxistitle=list(labels="difference in probability agree or strongly agree"),
topaxistitle=list(labels="difference in probability agree or strongly agree"),
plottitle=list(labels="\"Working Mothers Can Be Warm to Kids\""),
width=list(plot=2),
height=list(plottitle=3,xaxistitle=3.5,topaxistitle=3.5)
)