Deciding which statistical test to use is among the most consequential choices in a thesis or journal article: the wrong test can produce the wrong answer even from perfect data, and it is often one of the first errors reviewers catch. The good news is that test selection is not memorisation; it is a three-question decision process.
Question 1: What type is your dependent variable?
First be clear about what you measured. Continuous variables (scores, durations, age) lead to tests built on means; categorical variables (gender, yes/no, preference) lead to tests built on frequencies. Likert-type total scores are treated as continuous in practice, whereas a single Likert item is ordinal.
Question 2: How many groups or measurements are you comparing?
- One group, one measurement: descriptive statistics or one-sample tests.
- Two independent groups: the independent-samples t-test family.
- Repeated measurements of the same people: paired (dependent) tests.
- Three or more groups: the ANOVA family; a significant result demands post-hoc comparisons.
- Relationship or prediction questions: the correlation and regression family.
Question 3: Are the parametric assumptions met?
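If normality, homogeneity of variance and independence of observations hold, use a parametric test, because parametric tests carry more statistical power. When normality or homogeneity of variance is seriously violated, switch to the nonparametric counterpart. That switch is not a downgrade in quality; it is simply the correct methodological choice. Independence is a different matter: nonparametric tests assume it too, so dependent observations call for a design that models the dependence (for example, paired or repeated-measures tests), not for a rank-based test.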
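The three questions can be sketched as a small lookup function. This is a minimal illustration, not a complete decision procedure: the function name `choose_test`, the design labels such as `two_independent`, and the choice to route ordinal outcomes straight to the nonparametric column are assumptions made for the example.

```python
# Illustrative sketch of the three-question process.
# All labels and names here are hypothetical, chosen for this example.
PARAMETRIC = {
    "two_independent": "independent-samples t-test",
    "two_paired": "paired t-test",
    "three_plus_independent": "one-way ANOVA",
    "three_plus_repeated": "repeated-measures ANOVA",
    "relationship": "Pearson's r",
}

NONPARAMETRIC = {
    "two_independent": "Mann-Whitney U",
    "two_paired": "Wilcoxon signed-rank",
    "three_plus_independent": "Kruskal-Wallis",
    "three_plus_repeated": "Friedman",
    "relationship": "Spearman's rho",
}


def choose_test(dv_type, design, assumptions_met):
    """Return a suggested test name.

    dv_type: "continuous", "ordinal" or "categorical" (Question 1)
    design: one of the keys in PARAMETRIC (Question 2)
    assumptions_met: True if the parametric assumptions hold (Question 3)
    """
    if dv_type == "categorical":
        # Frequency data: the design argument is not consulted here.
        return "chi-square / Fisher's exact"
    if dv_type not in ("continuous", "ordinal"):
        raise ValueError(f"unknown dependent variable type: {dv_type!r}")
    if design not in PARAMETRIC:
        raise ValueError(f"unknown design: {design!r}")
    # Ordinal outcomes (e.g. a single Likert item) go to the rank-based column.
    use_parametric = dv_type == "continuous" and assumptions_met
    table = PARAMETRIC if use_parametric else NONPARAMETRIC
    return table[design]


print(choose_test("continuous", "two_independent", True))
print(choose_test("continuous", "three_plus_independent", False))
print(choose_test("ordinal", "two_paired", True))
```

A function like this only records the decision; whether the assumptions actually hold still has to be judged from the data.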
Which statistical test to use: the decision table
| Design | Parametric | Nonparametric | Effect size |
|---|---|---|---|
| 2 independent groups | Independent t-test | Mann-Whitney U | Cohen's d / r |
| 2 dependent measurements | Paired t-test | Wilcoxon signed-rank | Cohen's d / r |
| 3+ independent groups | One-way ANOVA | Kruskal-Wallis | η² / ε² |
| 3+ repeated measurements | Repeated-measures ANOVA | Friedman | Partial η² / Kendall's W |
| Two continuous variables | Pearson's r | Spearman's rho | r |
| Prediction (multiple predictors) | Multiple regression | — | R², f² |
| 2 categorical variables | — | Chi-square / Fisher's exact | Cramér's V / φ |
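As a worked example of the effect-size column, Cohen's d for two independent groups is the difference in means divided by the pooled standard deviation. The sketch below uses only Python's standard library; the function name `cohens_d` and the two small score lists are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scores, not real data.
treatment = [2, 4, 6]
control = [1, 3, 5]
print(cohens_d(treatment, control))  # means 4 and 3, pooled SD 2, so d = 0.5
```

Reporting the effect size alongside the p-value tells the reader how large the difference is, not only whether it is statistically detectable.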
Beyond group comparisons: relationship patterns
Mediation and moderation questions call for the PROCESS macro or structural equation modelling (AMOS, SmartPLS, Mplus). In scale development studies, exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) should be run on separate samples. These are exercises in model building rather than mere test selection, and they must be planned at the design stage — well before any data are collected.
The test serves the research question; the research question is never bent to fit the test.
Frequently Asked Questions
Can I run a t-test on Likert scale data?
The total or mean score of a multi-item scale is treated as continuous in practice, so a t-test is defensible when the distribution allows it. For a single Likert item, nonparametric tests are the safer choice.
My ANOVA is significant — how do I find which groups differ?
With post-hoc tests: Tukey HSD or Bonferroni when variances are homogeneous, and Games-Howell when they are not.
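The Bonferroni option is simple enough to sketch by hand: each raw pairwise p-value is multiplied by the number of comparisons and capped at 1. The function name `bonferroni_adjust`, the group labels and the raw p-values below are hypothetical, and the 0.05 threshold is an assumed significance level.

```python
def bonferroni_adjust(p_values):
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return {label: min(1.0, p * m) for label, p in p_values.items()}


# Hypothetical raw p-values from three pairwise comparisons.
raw = {"A vs B": 0.010, "A vs C": 0.030, "B vs C": 0.400}
adjusted = bonferroni_adjust(raw)

# Assumed significance level of 0.05.
significant = [label for label, p in adjusted.items() if p < 0.05]
print(adjusted)
print(significant)
```

In this made-up example, "A vs C" looks significant before adjustment but not after it, which is exactly the kind of inflated finding a post-hoc correction is meant to prevent.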
My normality tests are significant but my sample is large — what should I do?
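In large samples (n > 200), Kolmogorov-Smirnov and Shapiro-Wilk have enough power to flag even trivial departures from normality as significant. Judge normality from the skewness and kurtosis coefficients together with plots (histogram, Q-Q plot); because the central limit theorem makes the sampling distribution of the mean approximately normal in large samples, parametric tests are usually defensible there.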
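Skewness and excess kurtosis can be computed directly from the central moments. The sketch below uses the simple moment-based estimators; statistical packages usually report sample-adjusted versions, so values can differ slightly in small samples. The function name `skew_and_kurtosis` and the data are illustrative, and no cut-off is built in because acceptable ranges vary by field.

```python
from statistics import mean


def skew_and_kurtosis(values):
    """Return (skewness, excess kurtosis) from moment-based estimators."""
    n = len(values)
    m = mean(values)
    # Central moments with n in the denominator.
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    m4 = sum((x - m) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3  # 0 for a normal distribution
    return skewness, excess_kurtosis


# Hypothetical, perfectly symmetric data: skewness is 0.
skewness, excess_kurtosis = skew_and_kurtosis([1, 2, 3, 4, 5])
print(skewness, excess_kurtosis)
```

Coefficients like these complement a histogram or Q-Q plot; they do not replace looking at the distribution.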
Can I consult Celsus about test selection?
Yes. We review your design before data collection and deliver a written analysis plan matched to your research questions — so you never face the 'wrong data' problem after collection is finished.