Thesis Data Analysis in SPSS: A Step-by-Step Guide

How to run thesis data analysis in SPSS step by step: data entry and cleaning, normality checks, choosing the right test, and APA 7 reporting.

Most postgraduate students hit the same wall the moment data collection ends: “The data are in — now what?” This guide walks through thesis data analysis in SPSS from data entry to APA-formatted reporting, one decision at a time. The goal is an analysis you can defend number by number in front of your examination committee.

1. Why thesis data analysis in SPSS starts with cleaning

A sound analysis begins with a properly structured data file. In Variable View, define the measurement level of every variable (nominal, ordinal, scale) correctly: a mislabelled measurement level is a common cause of choosing the wrong test later on.

  • Reverse-code negatively worded items: use Transform → Recode into Different Variables so the original values are preserved.
  • Inspect missing data: missingness that is random and below 5% is rarely a problem; above that threshold, consider expectation-maximisation (EM) estimation or multiple imputation.
  • Screen for outliers: z-scores beyond ±3.29 flag univariate outliers, and Mahalanobis distance is the standard multivariate screen.
  • Compute reliability: report Cronbach's alpha for every scale and subscale (α ≥ .70 is the conventional acceptable threshold).
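SPSS handles these checks through its menus, but it helps to see the arithmetic behind them. The sketch below is plain Python rather than SPSS syntax, and every function name and data value in it is hypothetical. It assumes a 1 to 5 Likert scale, missing answers stored as None, and complete cases for the reliability calculation.

```python
from statistics import mean, stdev, variance

def reverse_code(value, low=1, high=5):
    """Reverse a Likert response, e.g. 2 becomes 4 on a 1-5 scale."""
    return low + high - value

def missing_share(values):
    """Share of missing responses (None) in one variable."""
    return sum(v is None for v in values) / len(values)

def univariate_outliers(values, cutoff=3.29):
    """Positions of cases whose absolute z-score exceeds the cutoff."""
    m, s = mean(values), stdev(values)
    return [i for i, v in enumerate(values) if abs((v - m) / s) > cutoff]

def cronbach_alpha(items):
    """items holds one list of scores per item, complete cases only."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]
    item_variance = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))

# Hypothetical data: three items answered by five respondents.
items = [[4, 3, 5, 2, 4], [5, 3, 4, 2, 4], [4, 2, 5, 3, 5]]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.2f}, acceptable: {alpha >= 0.70}")
```

Reverse-coding returns a new value instead of overwriting the old one, which mirrors the advice to recode into a different variable. The multivariate Mahalanobis screen is left out here because it needs matrix algebra.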

2. Checking normality: it decides which road you take

Parametric tests (t-tests, ANOVA, Pearson correlation) assume approximately normally distributed scores (strictly speaking, normally distributed residuals). Never rely on a single criterion: evaluate the skewness and kurtosis coefficients (within ±1.5, or the more lenient ±2), histograms and Q-Q plots together, and add the Shapiro-Wilk test when the sample is small. The verdict determines whether you take the parametric or the nonparametric route in everything that follows.
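To make the skewness and kurtosis criterion concrete, here is a minimal Python sketch of the two coefficients. It uses the bias-adjusted sample formulas that are commonly documented for SPSS and Excel; treat that match as an assumption and check it against your own Descriptives output. The function names and the scores are hypothetical.

```python
from statistics import mean, stdev

def skewness(values):
    """Bias-adjusted sample skewness."""
    n = len(values)
    m, s = mean(values), stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in values)

def excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis (0 for a normal distribution)."""
    n = len(values)
    m, s = mean(values), stdev(values)
    z4 = sum(((v - m) / s) ** 4 for v in values)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

def within_band(values, limit=1.5):
    """True when both coefficients fall inside the chosen band."""
    return (abs(skewness(values)) <= limit
            and abs(excess_kurtosis(values)) <= limit)

# Hypothetical scale scores with one high value pulling the tail right.
scores = [2, 3, 3, 4, 4, 4, 5, 5, 6, 12]
print(round(skewness(scores), 2), round(excess_kurtosis(scores), 2))
print("within ±1.5:", within_band(scores))
```

Passing limit=2 applies the more lenient band. The coefficients are only one criterion, so the plots and Shapiro-Wilk still need a look.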

[Bar chart] t-test/ANOVA: 34 · Correlation: 22 · Regression: 19 · Chi-square: 13 · Nonparametric: 12
Distribution of tests used across 400+ thesis analyses completed by Celsus, 2024–2026 (%)

3. Choose the right test: a decision table

Test selection comes down to three questions: what is the measurement level of the dependent variable, how many groups or measurements are being compared, and is the distribution normal? The table below summarises the scenarios you are most likely to face; for a deeper treatment, see our test selection decision guide.

Test selection by scenario
Research question                             | If normal                       | If not normal
Means of two independent groups               | Independent-samples t-test      | Mann-Whitney U
Two measurements of the same group            | Paired-samples t-test           | Wilcoxon signed-rank
Three or more independent groups              | One-way ANOVA                   | Kruskal-Wallis H
Relationship between two continuous variables | Pearson correlation             | Spearman's rho
Predicting one dependent variable             | Multiple linear regression      | Bootstrap regression / transformation
Association between two categorical variables | Chi-square test of independence | Fisher's exact (small cells)
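The decision table above can also be written as a simple lookup, which is handy if you want to log the reasoning behind each test in your analysis notes. This is only a sketch: the scenario keys and the choose_test function are hypothetical names, and the second argument stands for "normal" in the first six rows but for "cell counts are adequate" in the chi-square row.

```python
# Scenario -> (test when the assumption holds, fallback when it does not).
TESTS = {
    "two_independent_groups": ("Independent-samples t-test", "Mann-Whitney U"),
    "two_measurements_same_group": ("Paired-samples t-test", "Wilcoxon signed-rank"),
    "three_or_more_independent_groups": ("One-way ANOVA", "Kruskal-Wallis H"),
    "two_continuous_variables": ("Pearson correlation", "Spearman's rho"),
    "predict_one_dependent_variable": (
        "Multiple linear regression",
        "Bootstrap regression / transformation",
    ),
    "two_categorical_variables": (
        "Chi-square test of independence",
        "Fisher's exact (small cells)",
    ),
}

def choose_test(scenario, assumption_met):
    """Return the test named in the table for this scenario."""
    standard, fallback = TESTS[scenario]
    return standard if assumption_met else fallback

print(choose_test("three_or_more_independent_groups", assumption_met=False))
```

An unknown scenario raises a KeyError on purpose: a research question that does not fit the table deserves a closer look, not a silent default.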

4. Run the analysis and read the output properly

Every test output contains three figures you must report: the test statistic (t, F, χ² and so on), the degrees of freedom and the p value. Reviewers and committees, however, increasingly look beyond the p value to the effect size: Cohen's d, eta-squared (η²) or the correlation coefficient itself. The p value tells you how likely a difference at least this large would be if chance alone were at work; the effect size tells you whether the difference matters in practice.

A significant p value is not enough; when the committee asks 'how big is this difference?', your answer is the effect size.
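If you want to check an effect size by hand, the two most common ones reduce to a few lines. The sketch below assumes the pooled-standard-deviation form of Cohen's d for two independent groups and the one-way definition of eta-squared (between-groups sum of squares over total sum of squares). Function names and scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (((na - 1) * variance(group_a) + (nb - 1) * variance(group_b))
                  / (na + nb - 2))
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

def eta_squared(groups):
    """Eta-squared for a one-way design: SS_between / SS_total."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    return ss_between / ss_total

# Hypothetical post-test scores.
experimental = [78, 82, 75, 90, 85, 80]
control = [70, 76, 72, 81, 74, 77]
print(f"d = {cohens_d(experimental, control):.2f}")
print(f"eta-squared = {eta_squared([experimental, control]):.2f}")
```

Both numbers answer the committee's "how big?" question in a way the p value cannot.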

5. Report in APA 7 format

For each test in your results chapter, the same pattern serves: descriptive statistics (mean, standard deviation), the test statistic with its p value, the effect size, and a plain-language interpretation. For example: “Post-test scores in the experimental group (M = 78.4, SD = 6.2) were significantly higher than in the control group, t(58) = 3.41, p = .001, d = 0.88 (a large effect).” Italicise statistical symbols and drop the leading zero for statistics that cannot exceed 1, such as p and r — our APA 7 reporting checklist covers the full set of conventions.
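Formatting slips are easy to make when you copy dozens of values out of SPSS by hand, so a small helper can keep the pattern consistent. The sketch below rebuilds the statistics string from the example above. The function names are hypothetical, the unrounded p of .0012 is an assumed value, and italics still have to be applied in your word processor.

```python
def apa_p(p):
    """Format a p value APA-style: three decimals, no leading zero."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")

def apa_t(t, df, p, d):
    """Format an independent-samples t-test result with Cohen's d."""
    return f"t({df}) = {t:.2f}, {apa_p(p)}, d = {d:.2f}"

print(apa_t(3.41, 58, 0.0012, 0.88))
# prints: t(58) = 3.41, p = .001, d = 0.88
```

Cohen's d keeps its leading zero because it can exceed 1, which is why only the p value is stripped.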

The five most common mistakes

  1. Choosing a parametric test without ever checking normality.
  2. Running many tests without correcting the p values (Bonferroni or similar may be required).
  3. Reporting no effect sizes at all.
  4. Defining Likert total scores at the wrong measurement level (summed scale scores are usually treated as scale, single items as ordinal).
  5. Presenting the output as a pile of tables with no interpretation.
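Mistake 2 is the easiest to fix mechanically. The sketch below shows the Bonferroni adjustment alongside Holm's step-down method, which is our pick for the "or similar" option because it controls the same error rate while being less conservative. The function names and the p values are hypothetical.

```python
def bonferroni(p_values):
    """Multiply each p value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def holm(p_values):
    """Holm step-down adjustment, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on;
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted

# Hypothetical raw p values from three comparisons.
raw = [0.01, 0.04, 0.03]
print([round(p, 3) for p in bonferroni(raw)])
print([round(p, 3) for p in holm(raw)])
```

Compare each adjusted value against .05 as usual, and state in the thesis which correction you applied.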

Frequently Asked Questions

How long does thesis data analysis in SPSS take?

For a typical master's thesis with cleaned data, analysis plus reporting takes 3–7 working days. Doctoral projects involving scale development or structural equation modelling take longer.

What should I do if my sample is small?

For groups with fewer than 30 participants, nonparametric tests and Shapiro-Wilk normality checks are recommended; you should also justify the adequacy of your sample with a power analysis in G*Power.
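For a rough sense of the numbers involved before you open G*Power, the required group size for a two-tailed independent-samples comparison can be approximated with the normal distribution. This is a sketch, not a replacement for G*Power: it uses the z approximation, so it usually lands a case or two below the exact t-based figure. The function name and defaults are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-tailed, two-group comparison.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2.
    """
    z = NormalDist()  # standard normal
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)

for effect in (0.2, 0.5, 0.8):
    print(f"d = {effect}: about {n_per_group(effect)} participants per group")
```

The loop makes the practical point: small expected effects need far larger groups than most thesis samples provide, which is worth acknowledging in your limitations section.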

My p value is just above .05 — what now?

Massaging the data (p-hacking) is an ethical violation. Instead, report the effect size, the confidence interval and a power analysis, and discuss the finding honestly.

What support does Celsus offer for SPSS analysis?

We provide end-to-end support: data cleaning, test selection, running the analysis, writing the results chapter in APA 7 format, and rehearsing the committee defence. Every deliverable comes with a reproducible syntax file.
