LSD vs DMRT vs Tukey vs Scheffe: picking a mean-separation test without inflating error
Four mean-separation tests on the same treatment means, ordered from most liberal to most conservative. Why LSD over-declares, why Scheffe under-declares, and where DMRT and Tukey sit between.
A significant ANOVA F says the treatment means are not all equal. It does not say which ones differ. The mean-separation test you reach for next decides how aggressively pairs are declared different. LSD, DMRT, Tukey and Scheffe sit on a spectrum from most liberal to most conservative, and picking from the wrong end either invents differences or hides real ones.
The decision in one sentence
Use LSD only with few treatments and a significant F, DMRT as the common agronomic middle ground, Tukey when you want honest control of the family-wise error across all pairwise comparisons, and Scheffe only for complex contrasts beyond simple pairs.
The spectrum
Test       Controls                             Tendency           Use when
LSD        comparison-wise error                most liberal       few treatments, protected by sig. F
DMRT       a graded protection level by range   moderate           many treatments, agronomic standard
Tukey HSD  family-wise error across all pairs   conservative       all pairwise, honest error control
Scheffe    all possible contrasts               most conservative  complex contrasts, not just pairs

The further down the table you go, the wider the critical difference, so the harder it is to declare two means different. On identical data, LSD declares the most pairs significant and Scheffe the fewest.
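For reference, the critical differences behind that ordering can be written out in their usual textbook form for equal replication. The notation is an assumption of this sketch, not something the tests fix: k treatments, r replications per treatment, error mean square MSE on ν error degrees of freedom, significance level α, and q^D for Duncan's tabulated range value.

```latex
\begin{aligned}
\mathrm{LSD} &= t_{\alpha/2,\,\nu}\,\sqrt{2\,\mathrm{MSE}/r} \\
R_p\ (\mathrm{DMRT}) &= q^{D}_{\alpha;\,p,\,\nu}\,\sqrt{\mathrm{MSE}/r}, \qquad p = 2,\dots,k \\
\mathrm{HSD}\ (\mathrm{Tukey}) &= q_{\alpha;\,k,\,\nu}\,\sqrt{\mathrm{MSE}/r} \\
S\ (\mathrm{Scheffe}) &= \sqrt{(k-1)\,F_{\alpha;\,k-1,\,\nu}}\;\sqrt{2\,\mathrm{MSE}/r}
\end{aligned}
```

LSD, Tukey and Scheffe each give one critical difference for every pair. DMRT gives one critical range R_p per number of ranked means p spanned by the pair, which is why it is described as graded.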
Why LSD over-declares
LSD controls the error rate for a single comparison, not for the whole family of comparisons. With k treatments there are k(k-1)/2 pairs, and the chance that at least one false difference appears grows quickly with k. That is why the textbook rule is to use LSD only after a significant ANOVA F (the protected LSD) and only when the number of treatments is small.
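A rough sketch of how fast that chance grows, using only the Python standard library. It treats the k(k-1)/2 comparisons as independent tests at level alpha, which they are not (they share the same means and the same MSE), and it ignores the protection from the F test, so the figures are an illustration of the trend rather than exact error rates. The function names are hypothetical.

```python
from math import comb


def n_pairs(k: int) -> int:
    """Number of pairwise comparisons among k treatments: k(k-1)/2."""
    return comb(k, 2)


def approx_familywise_error(k: int, alpha: float = 0.05) -> float:
    """Chance of at least one false difference across all pairs,
    ASSUMING the comparisons are independent (a simplification)."""
    return 1.0 - (1.0 - alpha) ** n_pairs(k)


for k in (3, 5, 10):
    print(f"k={k:2d}  pairs={n_pairs(k):2d}  "
          f"approx. family-wise error={approx_familywise_error(k):.3f}")
```

With five treatments there are already ten pairs, and under this simplification the chance of at least one false difference is about 0.40 rather than the nominal 0.05.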
Why Scheffe under-declares
Scheffe protects every possible contrast, including complex linear combinations no one will test. For simple pairwise comparisons that protection is overkill, so its critical difference is the widest of the four and it rejects the fewest pairs. Scheffe earns its keep only when the question is a contrast, for example the mean of three nitrogen treatments against the control, rather than plain pairs.
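A minimal sketch of a Scheffe test on one such contrast. It borrows the five means, MSE and replication from the worked example further down, and it assumes, purely for illustration, that T1 is the control and that T2, T4 and T5 form the group being averaged. The F value is the standard table value for alpha 0.05 with (4, 12) degrees of freedom, and the function name is hypothetical.

```python
from math import sqrt

means = {"T1": 25.4, "T2": 31.2, "T3": 22.1, "T4": 28.7, "T5": 30.5}
r = 4            # replications per treatment
mse = 4.567      # error mean square from the ANOVA
k = len(means)   # number of treatments
F_CRIT = 3.26    # F(0.05; k-1 = 4, error df = 12), from a standard F table

# Hypothetical contrast: mean of T2, T4, T5 minus T1 (assumed control).
coeffs = {"T1": -1.0, "T2": 1 / 3, "T4": 1 / 3, "T5": 1 / 3}


def scheffe_contrast(means, coeffs, mse, r, k, f_crit):
    """Return (estimate, critical value, significant?) for one contrast."""
    assert abs(sum(coeffs.values())) < 1e-9, "contrast coefficients must sum to zero"
    estimate = sum(c * means[t] for t, c in coeffs.items())
    # Standard error of a contrast with equal replication r.
    se = sqrt(mse * sum(c * c for c in coeffs.values()) / r)
    critical = sqrt((k - 1) * f_crit) * se
    return estimate, critical, abs(estimate) > critical


estimate, critical, significant = scheffe_contrast(means, coeffs, mse, r, k, F_CRIT)
print(f"contrast = {estimate:.2f}, Scheffe critical value = {critical:.2f}, "
      f"significant = {significant}")
```

The same function accepts any set of coefficients that sums to zero, which is the point of Scheffe: the critical value is valid no matter how many such contrasts are examined.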
Worked example: five treatment means
Five treatments, four replications, with the error mean square and error df carried from the ANOVA, in the format StatVeda accepts:
# MSE: 4.567
# dfError: 12
T1: 25.4, 4
T2: 31.2, 4
T3: 22.1, 4
T4: 28.7, 4
T5: 30.5, 4
On the same five means, the four tests produce different critical differences. The pattern is illustrative, but the ordering is always the same:
Test       Critical difference  Pairs declared different
LSD        narrowest            most
DMRT       graded by range      intermediate
Tukey HSD  wider                fewer
Scheffe    widest               fewest
T2 versus T3 (a large gap) is declared different by every test. A borderline pair such as T1 versus T3 may be significant under LSD but not under Tukey or Scheffe. That is exactly the decision the test choice controls, and why it must be made on principle, not by picking whichever test makes the most pairs significant.
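The comparison can be checked by hand with a short standard-library sketch. It assumes alpha = 0.05 and hard-codes the usual table values for 12 error df (t = 2.179, studentized range q = 4.51 for five means, F = 3.26 for 4 and 12 df). DMRT is left out because its graded ranges need Duncan's table. This is an illustration of the arithmetic, not a description of how StatVeda computes its output.

```python
from itertools import combinations
from math import sqrt

means = {"T1": 25.4, "T2": 31.2, "T3": 22.1, "T4": 28.7, "T5": 30.5}
r, mse, k = 4, 4.567, 5

# Table values at alpha = 0.05 with 12 error df (assumed significance level).
T_CRIT = 2.179   # two-sided Student t
Q_CRIT = 4.51    # studentized range for k = 5 means
F_CRIT = 3.26    # F with (k - 1, 12) = (4, 12) df

se_diff = sqrt(2 * mse / r)  # standard error of a difference of two means

critical = {
    "LSD": T_CRIT * se_diff,
    "Tukey HSD": Q_CRIT * sqrt(mse / r),
    "Scheffe": sqrt((k - 1) * F_CRIT) * se_diff,
}

# For each test, list the pairs whose absolute difference exceeds its CD.
declared = {
    test: [(a, b) for a, b in combinations(means, 2)
           if abs(means[a] - means[b]) > cd]
    for test, cd in critical.items()
}

for test, cd in critical.items():
    print(f"{test:10s} CD = {cd:.2f}  pairs declared different: {len(declared[test])}")
```

On these inputs the critical differences come out at roughly 3.29, 4.82 and 5.46, separating 7, 5 and 4 of the 10 pairs respectively. T1 versus T3 differs by 3.3, which clears the LSD by a hair and falls well short of the other two.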
How to pick before you report
Decide the question first. Plain pairwise comparisons among a few treatments, with a significant F, justify the protected LSD. Many treatments and an agronomic audience expect DMRT. A claim that needs to survive scrutiny on every pair wants Tukey. A contrast among groups of treatments wants Scheffe. Pick once, before seeing which test flatters the result.
Common mistakes
- Running LSD without the protecting significant F, or with many treatments, which inflates false positives.
- Trying all four tests and reporting the one that makes the most pairs significant, which is error-rate shopping.
- Using Scheffe for simple pairwise comparisons, where it needlessly hides real differences.
- Quoting a single CD value when the test produces a graded set of critical ranges (DMRT).
- Forgetting that mean separation is only valid after a significant overall F.
When the F is significant but no pair separates
Under a conservative test this can happen: the overall F is significant but no individual pair clears the wider critical difference. That is not a contradiction. It means the evidence is spread across the treatments rather than concentrated in one pair. Reporting the F honestly, with the conservative test result, is better than switching to LSD to manufacture a separation.
Sources
- Steel, R. G. D. and Torrie, J. H. (1980). Principles and Procedures of Statistics: A Biometrical Approach, 2nd edition. McGraw-Hill, New York.
- Gomez, K. A. and Gomez, A. A. (1984). Statistical Procedures for Agricultural Research, 2nd edition. John Wiley and Sons, New York.
- Carmer, S. G. and Swanson, M. R. (1973). An evaluation of ten pairwise multiple comparison procedures by Monte Carlo methods. Journal of the American Statistical Association, 68(341), 66 to 74.