Blazej Mrozinski

Reliable Change Index (RCI)

Psychometrics

When you give someone the same test twice, their score will almost never be identical, and some of that difference is nothing more than measurement noise. The Reliable Change Index (RCI) is the tool that tells you whether a score change is large enough to count as signal. It answers a deceptively simple question: is this person’s change between Time 1 and Time 2 larger than what we’d expect from measurement error alone?

The RCI comes from Jacobson and Truax’s 1991 work on clinical significance, and it has become the standard way to decide whether an individual — not a group — has genuinely changed. That individual focus is what makes it useful in clinical and forensic settings, where the decision is about one patient, not a sample mean.

The formula

The index is a difference score scaled by the error around it:

RCI = (x₂ − x₁) / S_diff

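where x₁ and x₂ are the two observed scores and S_diff is the standard error of the difference. S_diff is built from the standard error of measurement (SEM). Both scores carry measurement error of the same size, so the two error variances add, which is where the √2 comes from: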

S_diff = √2 × SEM, and SEM = SD × √(1 − reliability)

So the whole thing rests on two properties of the instrument: how much the scores vary in the population (SD) and how reliable the test is. A more reliable instrument has a smaller SEM, a smaller S_diff, and therefore detects smaller genuine changes. An unreliable test needs a large swing before you can call the change real.
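The formula above is small enough to sketch in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the example instrument (SD = 10, reliability = 0.90) are made up for the example.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability < 1.0:
        # reliability == 1 would give SEM = 0 and an undefined RCI
        raise ValueError("reliability must be in [0, 1)")
    return sd * math.sqrt(1.0 - reliability)


def reliable_change_index(x1: float, x2: float, sd: float, reliability: float) -> float:
    """RCI = (x2 - x1) / S_diff, with S_diff = sqrt(2) * SEM."""
    sem = standard_error_of_measurement(sd, reliability)
    s_diff = math.sqrt(2.0) * sem
    return (x2 - x1) / s_diff


# Hypothetical instrument: SD = 10, reliability = 0.90.
# SEM is about 3.16 and S_diff about 4.47, so a 10-point gain gives RCI of about 2.24.
rci = reliable_change_index(x1=40, x2=50, sd=10, reliability=0.90)
print(round(rci, 2))  # 2.24
```

Which SD and reliability estimate to plug in (normative sample, baseline sample, test-retest or internal consistency) is a choice the sketch leaves to the caller.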

Reading the number

Because the RCI is expressed in standard-error units, it behaves like a z-score. The conventional threshold is |RCI| > 1.96, corresponding to a two-tailed p < .05: a change that large, in either direction, would arise from measurement error alone less than 5% of the time if the true score had not changed. Cross that threshold and the change is “reliable”, meaning it is unlikely to be measurement noise. The sign of the RCI gives the direction of the change.
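As a rough sketch of how the threshold is applied, the snippet below labels an RCI value and also turns the threshold around to show the smallest raw-score change that counts as reliable at different reliabilities. The function names, the `higher_is_better` flag, and the SD of 10 are assumptions made for illustration.

```python
import math


def minimum_reliable_change(sd: float, reliability: float, z_crit: float = 1.96) -> float:
    """Smallest raw-score difference whose |RCI| reaches z_crit."""
    s_diff = math.sqrt(2.0) * sd * math.sqrt(1.0 - reliability)
    return z_crit * s_diff


def classify_change(rci: float, z_crit: float = 1.96, higher_is_better: bool = True) -> str:
    """Label an RCI value using the conventional two-tailed threshold."""
    if abs(rci) <= z_crit:
        return "no reliable change"
    improved = (rci > 0) == higher_is_better
    return "reliable improvement" if improved else "reliable deterioration"


# Hypothetical instrument with SD = 10: lower reliability demands a bigger swing.
for reliability in (0.95, 0.90, 0.80, 0.70):
    print(reliability, round(minimum_reliable_change(10, reliability), 1))
# 0.95 6.2
# 0.9 8.8
# 0.8 12.4
# 0.7 15.2

print(classify_change(2.24))                           # reliable improvement
print(classify_change(2.24, higher_is_better=False))   # reliable deterioration
print(classify_change(1.10))                           # no reliable change
```

The direction flag matters in practice: on a symptom scale a negative RCI is the good outcome, so the sign alone should not be read as better or worse.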

What it doesn’t tell you

Reliable change is not the same as clinically meaningful change. The RCI tells you a change is statistically real; it doesn’t tell you the person has crossed from a clinical to a non-clinical range, which is the second half of Jacobson and Truax’s framework (a cutoff between dysfunctional and functional populations). A patient can show reliable improvement and still be symptomatic; the two criteria are reported together precisely because each answers a different question.
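To make the two-criterion idea concrete, here is a hedged sketch that combines an RCI value with a clinical cutoff. It assumes the cutoff is Jacobson and Truax’s criterion c, the point between the dysfunctional and functional distributions weighted by their standard deviations; other cutoffs exist. The function names, category labels, and all numbers are hypothetical.

```python
def clinical_cutoff(mean_dysf: float, sd_dysf: float,
                    mean_func: float, sd_func: float) -> float:
    """Criterion c: SD-weighted point between the two population means."""
    return (sd_dysf * mean_func + sd_func * mean_dysf) / (sd_dysf + sd_func)


def jt_category(x1: float, x2: float, rci: float, cutoff: float,
                z_crit: float = 1.96, higher_is_better: bool = False) -> str:
    """Combine reliable change with cutoff crossing into one label."""
    directed = rci if higher_is_better else -rci  # positive means improvement
    if directed < -z_crit:
        return "deteriorated"
    if directed <= z_crit:
        return "unchanged"
    # Reliable improvement: did the person also move into the functional range?
    if higher_is_better:
        crossed = x1 < cutoff <= x2
    else:
        crossed = x1 > cutoff >= x2
    return "recovered" if crossed else "improved"


# Hypothetical symptom scale (lower is better):
# dysfunctional population M = 30, SD = 8; functional population M = 10, SD = 6.
c = clinical_cutoff(mean_dysf=30, sd_dysf=8, mean_func=10, sd_func=6)
print(round(c, 2))  # 18.57

# A drop from 32 to 22 with RCI = -2.24 is reliable, but 22 is still above the cutoff.
print(jt_category(x1=32, x2=22, rci=-2.24, cutoff=c))  # improved
```

The example patient is the case described above: reliably better, yet still on the symptomatic side of the cutoff.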

This is the method underneath repeated testing in any panel that tracks people over time — it’s how the forensic-psychiatry diagnostic panel I built for KPS reports change between administrations, rather than presenting a raw before-and-after that would treat measurement noise as if it were progress. It rests on the same Classical Test Theory machinery — true scores, error, and reliability — that underlies most practical psychometric work.
