Be Less Wrong

Why Use Quantitative Methods?

Author

Andy Grogan-Kaylor

Published

June 8, 2025

References

Box, George. 1979. “Robustness in the Strategy of Scientific Model Building.” In Robustness in Statistics, edited by Robert L. Launer and Graham N. Wilkinson. Academic Press, Inc. [Harcourt Brace Jovanovich, Publishers], New York-London.
Elwert, Felix, and Christopher Winship. 2014. “Endogenous Selection Bias: The Problem of Conditioning on a Collider Variable.” Annual Review of Sociology 40 (July): 31–53. https://doi.org/10.1146/annurev-soc-071913-043455.
Gelman, Andrew, Boris Shor, Joseph Bafumi, and David Park. 2007. “Rich State, Poor State, Red State, Blue State: What’s the Matter with Connecticut?” Quarterly Journal of Political Science 2 (November): 345–67. https://doi.org/10.2139/ssrn.1010426.
Hand, David J. 2014. “Wonderful Examples, but Let’s not Close Our Eyes.” Statistical Science 29 (1): 98–100. https://doi.org/10.1214/13-STS446.
Lang, Jonas W. B., and Paul D. Bliese. in press. “Multilevel Research Designs.” In How to Get Published in the Best Industrial-Organizational Psychology Journals, edited by N. Bowling, M. K. Shoss, and Z. Zhou. Edward Elgar Publishing.
Launer, Robert L., and Graham N. Wilkinson. 1979. “Robustness in Statistics.” In Proceedings of a Workshop Held at the Army Research Office, Research Triangle Park, N.C., April 11–12, 1978, edited by Robert L. Launer and Graham N. Wilkinson, xvi+296. Academic Press, Inc. [Harcourt Brace Jovanovich, Publishers], New York-London.
Martín-Baró, Ignacio. 1994. “Toward a Liberation Psychology.” In Writings for a Liberation Psychology, edited by Adrianne Aron and Shawn Corne. Harvard University Press.
Nieuwenhuis, Rense. 2015. “Association, Aggregation, and Paradoxes: On the Positive Correlation Between Fertility and Women’s Employment.” Demographic Research 32 (March). https://www.demographic-research.org/volumes/vol32/23/.
Silverman, William A. 1998. “Non-Replication of the Replicable (1996).” In Where’s the Evidence? Debates in Modern Medicine. Oxford University Press.
Simpson, E. H. 1951. “The Interpretation of Interaction in Contingency Tables.” Journal of the Royal Statistical Society. Series B (Methodological) 13: 238–41. http://www.jstor.org/stable/2984065.
Westreich, Daniel, and Sander Greenland. 2013. “The Table 2 Fallacy: Presenting and Interpreting Confounder and Modifier Coefficients.” American Journal of Epidemiology 177 (February): 292–98. https://doi.org/10.1093/aje/kws412.

Footnotes

  1. An analogous process can occur with multilevel data, in which there are often many groups, such as many schools, many neighborhoods, or many countries. Failure to account for the grouping of the data (in schools, neighborhoods, or countries) can sometimes lead to dramatically incorrect results (Gelman et al. 2007; Nieuwenhuis 2015; Lang and Bliese in press).

  2. Since developing this tutorial, I’ve been reminded in conversations of some additional issues. For example, in this tutorial, I’m arguing for including as many control variables as possible. However, for some social issues, only small samples are available. Such samples may be statistically underpowered, and may be too small to support models with many control variables.

  3. Additionally, since developing this tutorial, I’ve also been reminded that one must be careful and thoughtful about choosing control variables. As a simple example, consider a hypothetical situation in which \(x\) is a cause of \(y\): \(x \rightarrow y\). If \(m\) is a mediator of the relationship between \(x\) and \(y\), then including \(m\) in one’s statistical model changes the meaning of the coefficient on \(x\). \(\beta_x\) is now an estimate of the direct effect of \(x\) on \(y\), holding \(m\) constant, rather than the total effect. There may also be an indirect effect of \(x\) on \(y\) through \(m\) (\(x \rightarrow m \rightarrow y\)) that needs to be accounted for using special procedures (cf. Westreich and Greenland 2013). Including a control variable \(c\) that is caused by both \(x\) and \(y\) (\(x \rightarrow c \leftarrow y\)), known as a collider, may introduce additional complications (Elwert and Winship 2014).

  4. Further, one could argue, somewhat convincingly, that an RCT (randomized controlled trial) would solve the major issue inspiring this presentation. By randomly assigning study participants to a treatment or control group, we would avoid the possibility that our results could be statistically confounded by other factors, and thus avoid the possibility that our results would flip or substantially change as we add more variables to the model. However, what is too rarely acknowledged is that RCTs are often based upon small, clinically or conveniently available samples that may not generalize well to other populations. Large observational studies with diverse populations, and models with many appropriate control variables, certainly have their role.
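
The mediator and collider cautions in footnote 3 can be illustrated with a small simulation. What follows is only a minimal sketch under stated assumptions, not a procedure taken from the tutorial: the path coefficients `A`, `B`, and `C`, the simulated data, and the helper `ols_slopes` are all hypothetical and chosen purely for illustration. It uses only the Python standard library, so the regression is solved by hand rather than with a statistics package.

```python
import random


def ols_slopes(y, *predictors):
    """Hypothetical helper: OLS slopes of y on the predictors.

    Centering every variable absorbs the intercept, so only the slopes
    are returned, in the order the predictors were passed.
    """
    n = len(y)

    def center(values):
        mean = sum(values) / n
        return [v - mean for v in values]

    yc = center(y)
    xc = [center(p) for p in predictors]
    k = len(xc)

    # Augmented normal equations [X'X | X'y] on the centered data.
    a = [
        [sum(u * v for u, v in zip(xc[i], xc[j])) for j in range(k)]
        + [sum(u * v for u, v in zip(xc[i], yc))]
        for i in range(k)
    ]

    # Gauss-Jordan elimination (X'X is positive definite here).
    for i in range(k):
        pivot = a[i][i]
        a[i] = [v / pivot for v in a[i]]
        for r in range(k):
            if r != i:
                factor = a[r][i]
                a[r] = [vr - factor * vi for vr, vi in zip(a[r], a[i])]

    return [a[i][k] for i in range(k)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 20000

# Assumed (hypothetical) path coefficients:
#   x -> m has strength A, m -> y has strength B, x -> y directly has strength C.
A, B, C = 0.5, 0.8, 0.3

# --- Mediator: x -> m -> y, plus a direct path x -> y ---
x = [rng.gauss(0, 1) for _ in range(n)]
m = [A * xi + rng.gauss(0, 1) for xi in x]
y = [C * xi + B * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

(total,) = ols_slopes(y, x)          # estimates the total effect, C + A * B
direct, via_m = ols_slopes(y, x, m)  # with m included, the x slope estimates C only

# --- Collider: x and y are independent, but both cause c ---
x2 = [rng.gauss(0, 1) for _ in range(n)]
y2 = [rng.gauss(0, 1) for _ in range(n)]
c = [xi + yi + rng.gauss(0, 1) for xi, yi in zip(x2, y2)]

(unadjusted,) = ols_slopes(y2, x2)   # close to zero: no true association
adjusted, _ = ols_slopes(y2, x2, c)  # controlling for c induces a spurious slope

print(f"total effect of x:             {total:.2f}")
print(f"direct effect (m included):    {direct:.2f}")
print(f"x slope, collider ignored:     {unadjusted:.2f}")
print(f"x slope, collider controlled:  {adjusted:.2f}")
```

In this setup, adding the mediator moves the coefficient on \(x\) from roughly the total effect toward the direct effect. Adding the collider creates an association between \(x\) and \(y\) that is not present in the simulated data. Neither change says that one model is "right"; each model answers a different question.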