Heuristic Evaluation

A structured way to catch usability problems before your users do

Heuristic evaluation is an expert review method where UX specialists assess an interface against a set of established usability principles. It's faster and cheaper than usability testing, and when done properly it surfaces real problems — not opinions.

What is Heuristic Evaluation?

Heuristic evaluation is a usability inspection method developed by Jakob Nielsen and Rolf Molich in 1990. UX experts review an interface and document usability problems based on a set of recognised principles — called heuristics. No users are involved. It's an expert judgment method, not a user research method.

The output is a prioritised list of usability issues, each mapped to a specific heuristic and rated by severity. Done well, it gives product teams a clear diagnostic of where the interface is working against users — without the time and cost of recruiting and running a usability study.

Nielsen's 10 Heuristics

Nielsen's 10 principles, published in their current form in 1994, remain the most widely used evaluation framework in the field:

  1. Visibility of system status — users should always know what the product is doing
  2. Match between system and real world — use language users know, not internal jargon or developer terminology
  3. User control and freedom — make it easy to undo, go back, and exit
  4. Consistency and standards — follow platform conventions; don't make users guess
  5. Error prevention — design to stop mistakes before they happen
  6. Recognition over recall — make options visible rather than relying on memory
  7. Flexibility and efficiency — let expert users move faster without confusing beginners
  8. Aesthetic and minimalist design — don't display information that competes with what's relevant
  9. Help users recognise, diagnose, and recover from errors — plain-language messages that explain what happened and what to do
  10. Help and documentation — provide it, but design so users rarely need it

Most B2B SaaS products fail consistently on 3, 5, and 6. Enterprise internal tools, built by engineers without UX input, tend to fail across most of the list.

Heuristic Evaluation vs. Usability Testing

These two methods are regularly confused or treated as substitutes. They're not — they answer different questions.

Heuristic evaluation uses UX experts as evaluators. It's fast (usually one to two days), affordable, and great at catching violations of known principles. What it can't tell you is how real users actually behave in your specific context, or whether a design change actually improved task completion.

Usability testing uses real users completing real tasks. It surfaces actual confusion, unexpected behaviours, and failure points that experts might not anticipate — especially in domain-specific products where user knowledge and mental models are highly contextual. It's slower and more expensive to run, but the evidence is direct.

Neither is sufficient on its own. Heuristic evaluation is best used as a first pass — to triage obvious issues quickly and cheaply — before investing in {{LINK:usability-testing}} with real users. Running them in sequence gives you both breadth and depth.

How the Process Works

The standard approach:

  1. Define the scope — which flows, screens, or tasks will be evaluated? A focused scope produces more actionable findings than a vague 'review the whole product.'
  2. Select 3–5 evaluators — Nielsen's research shows that a single evaluator catches around 35% of usability problems. Five evaluators typically surface 75–80%. Beyond five, the return diminishes quickly.
  3. Independent review — each evaluator works through the defined flows separately, documenting every issue they encounter. Independence is critical to avoid anchoring bias.
  4. Aggregate and deduplicate — findings from all evaluators are combined, duplicates merged, and issues organised by screen or flow.
  5. Rate severity — each issue is rated on a 0–4 scale (0 = not a problem, 4 = usability catastrophe) to drive prioritisation.
  6. Produce a findings report — specific issues, mapped to heuristics, with recommendations attached to each.
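Steps 4–6 above (aggregate, deduplicate, rate, prioritise) can be sketched in a few lines of Python. This is an illustrative model only: the `Finding` fields, the deduplication key, and the tie-break on evaluator frequency are assumptions for the sketch, not part of the method as Nielsen and Molich defined it.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    screen: str       # where the issue appears
    heuristic: int    # which of Nielsen's 10 heuristics it violates
    description: str  # what the evaluator observed
    severity: int     # 0 (not a problem) to 4 (usability catastrophe)

def aggregate(findings: list[Finding]) -> list[Finding]:
    """Merge duplicate findings from independent evaluators, keeping the
    highest severity seen for each issue, then sort by severity and by
    how many evaluators reported it (frequency aids prioritisation)."""
    merged: dict[tuple, Finding] = {}
    counts: dict[tuple, int] = defaultdict(int)
    for f in findings:
        key = (f.screen, f.heuristic, f.description)
        counts[key] += 1
        if key not in merged or f.severity > merged[key].severity:
            merged[key] = f
    return sorted(
        merged.values(),
        key=lambda f: (f.severity, counts[(f.screen, f.heuristic, f.description)]),
        reverse=True,
    )
```

In practice the dedup key is a human judgment call (two evaluators rarely phrase the same issue identically), but the shape of the pipeline — merge, keep the worst rating, sort for the report — is the same.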

What Good Findings Look Like

The quality of a heuristic evaluation lives or dies on how findings are written. Vague findings get ignored. Specific, actionable findings get fixed.

A weak finding: 'The error message is unclear.'

A strong finding: 'The error message on the payment confirmation screen (Heuristic 9: error recovery) reads: "Transaction failed — code 401." Users have no way to identify what failed or what to do next. Recommended fix: replace with plain-language message stating the likely cause (e.g. card declined, session expired) and a clear next step.'

The difference is specificity — location, heuristic, business impact, recommendation. Without all four, the finding sits in a report and changes nothing.
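That four-part test can be made mechanical. Here is a minimal sketch of a completeness check; the field names are my own for illustration, not a standard report schema:

```python
from dataclasses import dataclass, fields

@dataclass
class Finding:
    location: str        # e.g. "payment confirmation screen"
    heuristic: str       # e.g. "9: help users recover from errors"
    impact: str          # what the problem costs the user or business
    recommendation: str  # the specific fix proposed

def is_actionable(finding: Finding) -> bool:
    """A finding missing any of the four elements tends to be ignored."""
    return all(getattr(finding, f.name).strip() for f in fields(finding))
```

A weak finding like "The error message is unclear" fails this check three times over: no location, no heuristic, no recommendation.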

When to Use It (and When Not To)

Heuristic evaluation is the right tool when:

  • You're baselining a product before a redesign and want to document what's broken before changing anything
  • You're QA-ing a new flow after a design sprint, before it goes to users
  • Time or budget rules out a usability study
  • You're evaluating internal tools or admin interfaces where recruiting actual users is impractical

It's not the right tool when:

  • You need to understand why users behave a specific way (you need real users for that)
  • The product is highly domain-specific and evaluator expertise in the domain matters as much as UX expertise
  • You need to validate that a change measurably improved outcomes

For full product diagnostics, heuristic evaluation is usually one component of a broader {{INTERNAL:/services/ux-audit}} — not a standalone answer.

Questions to Ask Before You Start

Before kicking off an evaluation, it's worth getting clarity on a few things:

  • Are the evaluators genuinely experienced with the type of interface being reviewed? A heuristic evaluation of a complex B2B workflow run by someone who mostly reviews consumer apps will miss a lot.
  • Is the scope defined clearly enough that all evaluators are reviewing the same thing?
  • How will findings be prioritised — by severity, frequency, or business impact? The answer shapes the report structure.
  • Who on the product team will receive the findings, and what authority do they have to act on them?

Without a clear answer to the last question, even a well-run evaluation ends up in a shared folder and nothing changes.

Related: {{LINK:usability-testing}}, {{LINK:ux-audit}}, {{LINK:ux-debt}}