Description
A 4×4 cross-classification of income level (ordinal) against job-satisfaction level (ordinal) from a US survey. Used to illustrate ordinal-association measures (Kendall’s $\tau_b$, Goodman-Kruskal $\gamma$), the analogues of Pearson’s correlation for ordered categorical data.
| Income | Very Dissat. | Little Dissat. | Mod. Sat. | Very Sat. |
|---|---|---|---|---|
| <15,000 | 1 | 3 | 10 | 6 |
| 15,000–25,000 | 2 | 3 | 10 | 7 |
| 25,000–40,000 | 1 | 6 | 14 | 12 |
| >40,000 | 0 | 1 | 9 | 11 |
Ordinal association
From For Ordinal Variables and Definition 4.1
A pair of subjects is:
- concordant: the subject ranked higher on $X$ is also ranked higher on $Y$
- discordant: the subject ranked higher on $X$ is ranked lower on $Y$
- tied: same rank on $X$ and/or $Y$
Let $C$ be the number of concordant pairs and $D$ the number of discordant pairs.
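The pair definitions above can be checked by brute force on the job-satisfaction table. The following is a minimal sketch, not part of the original notes: it expands the table into one (income rank, satisfaction rank) tuple per respondent and classifies every pair. The names `subjects` and `classify_pairs` are illustrative choices.

```python
from itertools import combinations

# Job-satisfaction table: rows = income level, columns = satisfaction level.
table = [[1, 3, 10, 6],
         [2, 3, 10, 7],
         [1, 6, 14, 12],
         [0, 1, 9, 11]]

# One (income rank, satisfaction rank) tuple per respondent.
subjects = [(i, j)
            for i, row in enumerate(table)
            for j, count in enumerate(row)
            for _ in range(count)]


def classify_pairs(subjects):
    """Count concordant, discordant and tied pairs of subjects."""
    concordant = discordant = tied = 0
    for (x1, y1), (x2, y2) in combinations(subjects, 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:        # same ordering on both variables
            concordant += 1
        elif product < 0:      # opposite ordering
            discordant += 1
        else:                  # equal on at least one variable
            tied += 1
    return concordant, discordant, tied


C, D, T = classify_pairs(subjects)
print(C, D, T)  # 1331 849 2380
```

With 96 respondents there are 96 · 95 / 2 = 4560 pairs in total, so the brute-force loop is cheap here; for large tables, counting cell by cell is preferable.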
Goodman-Kruskal $\gamma$: $\gamma = \dfrac{C - D}{C + D}$. Easy to interpret; ignores ties.
Kendall’s $\tau_b$: similar numerator, with a normalising denominator that handles ties; less sensitive to the cut-points defining the categories.
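As a worked check, both measures can be computed directly from the cell counts. This sketch assumes the standard definitions, which the notes describe only in words: $\gamma = (C - D)/(C + D)$ and $\tau_b = (C - D)/\sqrt{(n_0 - n_1)(n_0 - n_2)}$, where $n_0$ is the total number of pairs and $n_1$, $n_2$ are the numbers of pairs tied on the row and column variable respectively. The helper names are hypothetical.

```python
import math

# Job-satisfaction table: rows = income level, columns = satisfaction level.
table = [[1, 3, 10, 6],
         [2, 3, 10, 7],
         [1, 6, 14, 12],
         [0, 1, 9, 11]]


def concordant_discordant(table):
    """Return (C, D), pairing each cell with the cells in rows strictly below it."""
    n_rows, n_cols = len(table), len(table[0])
    C = D = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                C += table[i][j] * sum(table[k][j + 1:])  # higher on both variables
                D += table[i][j] * sum(table[k][:j])      # higher on rows, lower on columns
    return C, D


def n_pairs(m):
    """Number of unordered pairs among m subjects."""
    return m * (m - 1) // 2


C, D = concordant_discordant(table)
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n0 = n_pairs(sum(row_totals))             # all pairs
n1 = sum(n_pairs(t) for t in row_totals)  # pairs tied on income
n2 = sum(n_pairs(t) for t in col_totals)  # pairs tied on satisfaction

gamma = (C - D) / (C + D)
tau_b = (C - D) / math.sqrt((n0 - n1) * (n0 - n2))
print(f"gamma = {gamma:.4f}, tau-b = {tau_b:.4f}")  # gamma = 0.2211, tau-b = 0.1524
```

Because $\gamma$ drops tied pairs from its denominator while $\tau_b$ keeps them, $|\gamma| \ge |\tau_b|$ on the same table, which is what the two values show.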
Both measures lie in $[-1, 1]$:
- near $0$ → weak trend
- near $\pm 1$ → strong monotone (positive / negative) association
See Gamma Tau for a longer reference on these measures.
R code
```r
# 4x4 table: rows = income level, columns = job-satisfaction level
x <- matrix(c(1, 3, 10,  6,
              2, 3, 10,  7,
              1, 6, 14, 12,
              0, 1,  9, 11), ncol = 4, byrow = TRUE)
dimnames(x) <- list(c("<15,000", "15,000-25,000",
                      "25,000-40,000", ">40,000"),
                    c("Very Dissat.", "Little Dissat.",
                      "Mod. Sat.", "Very Sat."))
us_svy_tab <- as.table(x)

library(DescTools)
# Describe the table and pull out the association measures
output <- Desc(us_svy_tab, plotit = FALSE, verbose = 3)
output[[1]]$assocs
```
Python code
```python
import numpy as np
from scipy import stats

us_svy_tab = np.array([[1, 3, 10, 6],
                       [2, 3, 10, 7],
                       [1, 6, 14, 12],
                       [0, 1, 9, 11]])

# "Unroll" the table: one (row index, column index) pair per respondent
n_rows, n_cols = us_svy_tab.shape
x, y = [], []
for i in range(n_rows):
    for j in range(n_cols):
        for _ in range(us_svy_tab[i, j]):
            x.append(i)
            y.append(j)

kt_output = stats.kendalltau(x, y)  # the default variant is tau-b
print(f"Estimate of tau-b: {kt_output.statistic:.4f}.")
# The estimate of tau-b is 0.1524.
```
Results: Goodman-Kruskal $\gamma \approx 0.221$, Kendall’s $\tau_b \approx 0.152$. Both point to a weak positive association: higher-income respondents tend to report higher job satisfaction, though the effect is modest and the lower confidence limit is close to zero, so the result is borderline significant.
Note that in Python, `scipy.stats.kendalltau` expects two rank vectors (one entry per respondent), so we reconstruct them by “unrolling” the contingency table. R’s `DescTools::Desc` works directly on the table.
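The unrolling step can also be done without explicit loops. The sketch below is an alternative, not the approach used in these notes: it uses `np.indices` and `np.repeat` to build the same two vectors, then rebuilds the table from them as a sanity check.

```python
import numpy as np

us_svy_tab = np.array([[1, 3, 10, 6],
                       [2, 3, 10, 7],
                       [1, 6, 14, 12],
                       [0, 1, 9, 11]])

# Row and column index of every cell, flattened in row-major order.
row_idx, col_idx = np.indices(us_svy_tab.shape)
counts = us_svy_tab.ravel()

# Repeat each cell's indices as many times as its count.
x = np.repeat(row_idx.ravel(), counts)
y = np.repeat(col_idx.ravel(), counts)

# Sanity check: cross-tabulating x and y recovers the original table.
rebuilt = np.zeros_like(us_svy_tab)
np.add.at(rebuilt, (x, y), 1)
print(len(x), np.array_equal(rebuilt, us_svy_tab))  # 96 True
```

The resulting `x` and `y` can be passed to `stats.kendalltau(x, y)` in the same way as the lists built by the nested loops.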
See also: L4 Exploring Categorical Data · Gamma Tau