Thematic Analysis with NVivo/MAXQDA Guide

Thematic analysis is a method for systematically identifying, analysing and reporting recurring patterns of meaning in qualitative data such as interviews, focus groups and documents. Because it is not tied to any single theoretical framework, it is among the most widely chosen qualitative approaches in postgraduate theses, but that flexibility is no excuse for a loosely run process. This guide walks through the six-phase procedure, codebook development, intercoder reliability, saturation and how to justify it, the four trustworthiness criteria, the practical differences between NVivo and MAXQDA, and reporting themes with quotations.

The thematic analysis process: Six phases

Familiarisation: Read every transcript at least twice and capture first impressions in analytic memos. Doing your own transcription accelerates this phase considerably.
Generating initial codes: Attach a short, descriptive label to every data segment relevant to the research question; a single segment may carry several codes.
Searching for themes: Cluster codes into candidate themes on the basis of similarity and pattern, and sketch a draft thematic map.
Reviewing themes: Test each theme first against its coded extracts and then against the full dataset; merge overlapping themes and dissolve weak ones.
Defining and naming: Write a one- or two-sentence essence statement for each theme; names should be analytic, not merely descriptive.
Writing up: Present the themes as a narrative tied to the research question, supported by carefully selected extracts.

The process is iterative, not linear: reviewing themes routinely sends you back to coding. The question examiners ask most often is how this back-and-forth was documented — which is exactly why analytic memos and successive codebook versions should be archived.
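
The step from coded segments to candidate themes (generating initial codes, then searching for themes) can be pictured with a small data structure. The sketch below is illustrative only: the extracts, code labels, theme names and the helper extracts_by_theme are all hypothetical, and it does not reproduce anything NVivo or MAXQDA does internally. It only shows that one segment may carry several codes and can therefore appear under more than one candidate theme.

```python
from collections import defaultdict

# Hypothetical coded segments: (participant, extract, codes attached to it).
segments = [
    ("P1", "I never knew who to ask for feedback.", ["unclear supervision", "isolation"]),
    ("P2", "My supervisor replied within a day.", ["responsive supervision"]),
    ("P3", "I wrote alone for months.", ["isolation"]),
]

# Hypothetical draft thematic map: candidate theme -> codes clustered under it.
candidate_themes = {
    "Supervision as a variable resource": ["unclear supervision", "responsive supervision"],
    "Working in isolation": ["isolation"],
}


def extracts_by_theme(segments, candidate_themes):
    """Collect the coded extracts that fall under each candidate theme."""
    code_to_theme = {
        code: theme
        for theme, codes in candidate_themes.items()
        for code in codes
    }
    grouped = defaultdict(list)
    for participant, extract, codes in segments:
        # A segment with several codes can feed more than one theme,
        # but is listed only once per theme.
        themes_hit = {code_to_theme[c] for c in codes if c in code_to_theme}
        for theme in themes_hit:
            grouped[theme].append((participant, extract))
    return dict(grouped)


theme_extracts = extracts_by_theme(segments, candidate_themes)
for theme, extracts in theme_extracts.items():
    print(theme, "->", [participant for participant, _ in extracts])
```

Rebuilding this mapping after every codebook revision and keeping the old versions is one simple way to show how the thematic map changed between passes.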

Percentage share of analysis effort across the six phases in a typical thesis project (illustrative).

Inductive or deductive? Developing the codebook

In inductive coding the codes are derived from the data; in deductive coding you start from a template drawn from theory or prior literature. Most theses blend the two in practice: core codes come from the framework, while new codes are opened for unexpected patterns. State your choice explicitly in the methods chapter, because the reliability strategy you can defend depends on it.

A codebook records three things for every code: a definition, inclusion and exclusion criteria, and an example extract. The codebook is not frozen after the first few transcripts; each revision is logged with a date and a rationale. That version history becomes the backbone of the audit trail discussed below.
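
A codebook entry with its revision log can be sketched as a small record type. This is a minimal illustration under stated assumptions, not an export format of either package: the class names CodebookEntry and Revision, the field names and the sample code are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Revision:
    changed_on: date
    rationale: str


@dataclass
class CodebookEntry:
    name: str
    definition: str
    include_when: str
    exclude_when: str
    example_extract: str
    origin: str = "inductive"  # or "deductive" for codes taken from the framework
    revisions: List[Revision] = field(default_factory=list)

    # Fields that may change between codebook versions (an assumption of this sketch).
    _EDITABLE = ("definition", "include_when", "exclude_when", "example_extract")

    def revise(self, changed_on, rationale, **changes):
        """Apply changes and log the date and rationale for the audit trail."""
        for attr in changes:
            if attr not in self._EDITABLE:
                raise ValueError(f"{attr!r} cannot be revised")
        for attr, value in changes.items():
            setattr(self, attr, value)
        self.revisions.append(Revision(changed_on, rationale))


# Hypothetical example entry.
isolation = CodebookEntry(
    name="isolation",
    definition="Participant describes working without peer or supervisor contact.",
    include_when="Explicit mention of being alone during the research process.",
    exclude_when="Solitude described as a deliberate, positive choice.",
    example_extract="I wrote alone for months.",
)
isolation.revise(
    date(2024, 3, 1),
    "Transcript 4 showed chosen solitude; tightened the exclusion rule.",
    exclude_when="Solitude framed as chosen or beneficial.",
)
print(len(isolation.revisions), isolation.exclude_when)
```

Keeping the rationale next to the date is the point of the design: the revision list can be pasted into an appendix as the codebook's version history.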

Intercoder reliability: The kappa debate

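In codebook-style approaches, journals typically expect two independent coders to code 10–20% of the data and to reach a Cohen's kappa of at least 0.70. On the commonly cited Landis and Koch scale, values of 0.61–0.80 count as substantial agreement and 0.81–1.00 as almost perfect. Both NVivo and MAXQDA have built-in intercoder agreement tools that report a kappa coefficient; check which kappa variant and which unit of comparison (code, document or segment) your version uses before you report the figure.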
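
To see what the software is reporting, Cohen's kappa can be reproduced by hand: it is the observed agreement minus the agreement expected by chance, divided by one minus the chance agreement. The sketch below is a plain-Python illustration for two coders and one decision per segment; the function name cohens_kappa and the toy data are assumptions, and the packages may define the comparison unit differently.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assign one label per segment."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of segments")
    n = len(coder_a)
    # Observed agreement: share of segments where the labels match.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's marginal proportions, summed over labels.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both coders use a single label throughout")
    return (observed - expected) / (1 - expected)


# Toy data: 1 = code applied to the segment, 0 = not applied.
coder_a = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
coder_b = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 2))
```

On this toy data the coders agree on 8 of 10 segments (observed agreement 0.80), chance agreement is 0.50, and kappa comes out at about 0.60. That is a reminder that 80% raw agreement can still fall short of the 0.70 expectation.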

The reflexive thematic analysis tradition, however, objects to this expectation: themes are not waiting in the data to be 'discovered' but are generated through the researcher's theoretical position, so when two coders apply the same label it demonstrates a shared perspective rather than accuracy. The practical decision rule is this: if your study uses a structured codebook, a coding team and descriptive aims, report kappa; if it is a single-researcher, interpretive reflexive analysis, keep and report a reflexivity journal and peer debriefing instead. Trying to defend both at once invites the charge of methodological incoherence.

Saturation: When and how to justify it

Data saturation is the point at which additional interviews stop generating new codes or themes. The bare sentence 'saturation was reached' is not enough — reviewers want evidence. A defensible route is to tabulate the number of new codes by interview order and report, for example, that the share of new codes fell below 5% across the final three interviews. In the reflexive tradition, sample size can instead be justified through information power: sample specificity, the narrowness of the question and the quality of the dialogue. Whichever rationale you use, defend the number alongside the analysis, not before it.
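
A saturation table of this kind is easy to build from the list of codes applied in each interview. The sketch below is one possible reading, with its assumptions stated: "share of new codes" is taken to mean new codes divided by the distinct codes applied in that interview (dividing by the cumulative codebook size is another defensible choice), and the function names, the 5% threshold default and the letter-coded toy data are hypothetical.

```python
def new_code_table(codes_per_interview):
    """Tabulate new and cumulative codes in interview order."""
    seen = set()
    rows = []
    for order, codes in enumerate(codes_per_interview, start=1):
        applied = set(codes)
        new = applied - seen
        seen |= new
        # Assumption: share = new codes / distinct codes applied in this interview.
        share = len(new) / len(applied) if applied else 0.0
        rows.append({
            "interview": order,
            "new_codes": len(new),
            "cumulative_codes": len(seen),
            "new_share": share,
        })
    return rows


def tail_below_threshold(rows, last_n=3, threshold=0.05):
    """True if the share of new codes stays below the threshold in the last n interviews."""
    if len(rows) < last_n:
        return False
    return all(row["new_share"] < threshold for row in rows[-last_n:])


# Toy data: codes applied in each of six interviews, in order.
interviews = [
    ["a", "b", "c", "d"],
    ["a", "c", "e", "f"],
    ["b", "e", "g"],
    ["a", "f", "g"],
    ["c", "d", "e"],
    ["a", "b", "g"],
]
rows = new_code_table(interviews)
for row in rows:
    print(row)
print("saturation criterion met:", tail_below_threshold(rows))
```

Whichever denominator you choose, state it in the methods chapter next to the table so the reader can check the percentage.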

Trustworthiness: The four criteria

The validity-and-reliability language of quantitative research is answered in qualitative work by four criteria. The table below summarises the strategies accepted in theses and journal articles for each criterion; your methods chapter should state, one by one, which of them you used.

Trustworthiness criteria and corresponding strategies
Criterion	Quantitative counterpart	Key strategies
Credibility	Internal validity	Member checking, prolonged engagement, data/investigator triangulation
Transferability	External validity	Thick description, detailed account of context and sampling
Dependability	Reliability	Audit trail, codebook version history, external audit
Confirmability	Objectivity	Reflexivity journal, peer debriefing, mapping findings back to raw data

NVivo or MAXQDA?

Both packages offer coding, codebook export, kappa calculation, matrix queries and visualisation — and neither will generate themes for you. The practical differences: MAXQDA is stronger on mixed-methods tooling (joint quantitative-qualitative tables, a built-in statistics module) and offers a multilingual interface; its document portraits and code maps are popular in theses. NVivo stands out for auto-coding, query flexibility and team coding on large projects (NVivo Collaboration), and is more common in institutional university licences. The choice usually comes down to your supervisor's and institution's ecosystem rather than analytic superiority — a flawless thematic analysis can be run in either.

Reporting themes with quotations

In the findings chapter, each theme follows the same structure: an analytic definition of the theme → two or three short extracts (with participant codes, e.g. P7) → your interpretation of the extracts and the link back to the literature. The most common failure is a wall of long, back-to-back quotations in which the interpretation disappears; an extract is evidence, not a substitute for narrative. Reporting how many participants contributed to a theme ('9 of 12 participants') is useful in descriptive studies, but take care not to reduce a qualitative finding to a frequency count. If your design adds a quantitative strand, see our statistical test selection guide as well.
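
If you do report participant counts, count distinct participants rather than extracts, since one talkative participant can supply many extracts for the same theme. The sketch below illustrates that distinction; the function name participant_coverage, the theme labels and the participant list are hypothetical.

```python
def participant_coverage(coded_extracts, all_participants):
    """Return a 'k of n participants' string per theme, counting each participant once."""
    contributors = {}
    for participant, theme in coded_extracts:
        contributors.setdefault(theme, set()).add(participant)
    total = len(set(all_participants))
    return {
        theme: f"{len(people)} of {total} participants"
        for theme, people in contributors.items()
    }


# Toy data: (participant, theme) pairs, one per coded extract.
coded_extracts = [
    ("P1", "Working in isolation"),
    ("P1", "Working in isolation"),  # second extract from the same participant
    ("P3", "Working in isolation"),
    ("P2", "Supervision as a variable resource"),
]
coverage = participant_coverage(coded_extracts, ["P1", "P2", "P3", "P4"])
print(coverage)
```

The count supports the narrative; it does not replace the analytic definition and interpretation of the theme.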

Software codes; the researcher builds the theme — neither NVivo nor MAXQDA can do the thinking for you.

Frequently Asked Questions

How many interviews are enough for a thematic analysis?

There is no universal number; 12–20 interviews are common in thesis projects. What matters is justifying the figure with a saturation table or the information-power rationale. A narrow question in a homogeneous group needs fewer interviews; a broad question in a heterogeneous group needs more.

What should Cohen's kappa be, and is it required in every thematic analysis?

In codebook-based team studies, 0.70 or above is the usual expectation. In reflexive thematic analysis, kappa is considered methodologically inappropriate; a reflexivity journal and peer debriefing are reported instead. You need to be clear from the outset which tradition you are working in.

Should I choose NVivo or MAXQDA for my thesis?

If mixed-methods integration and a multilingual interface matter most, MAXQDA tends to win; if institutional licensing and team coding matter most, NVivo does. Both are analytically sufficient, and choosing the package your supervisor knows well will speed the process up.

What qualitative analysis services does Celsus offer?

We provide end-to-end support: interview guide development, codebook design after transcription, coding in NVivo/MAXQDA with a kappa report, saturation tables, documented trustworthiness strategies, and theme write-ups with quotations. Every deliverable comes with a full audit trail.