
Readability formulas explained

The complete guide to the six readability formulas in active use today — Flesch Reading Ease, Flesch-Kincaid Grade Level, Gunning Fog, SMOG, Coleman-Liau, and the Automated Readability Index. Where each one came from, what it measures, when to trust it, and how to use them together.


What readability formulas actually do

A readability formula is a mathematical proxy for reading difficulty. It takes objective text features — sentence length, word length, syllable count, complex-word percentage — and combines them into a single number that predicts how hard the text will be for an average reader of a given grade level.

None of the formulas reads or understands text in any meaningful sense. None of them measures clarity, accuracy, persuasiveness, or quality. What they measure, with surprising consistency, is the cognitive load of decoding: how much working memory the average reader needs to hold the sentence together while figuring out what each word means.

The insight that powers every formula is the same: longer sentences are harder to follow because they overflow working memory, and longer or rarer words are harder to decode because they activate fewer prior associations. Combine the two signals, calibrate against a corpus of text that has been independently rated for difficulty, and you have a formula. Every readability metric in this guide is a variation on that single theme.
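As a concrete instance of that theme, here is a minimal Python sketch of the published 1948 Flesch Reading Ease formula. The syllable counter is a rough vowel-group heuristic, not a dictionary-accurate one, and the tokenisation is deliberately naive:

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, drop a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease (1948):
    206.835 - 1.015 * (words/sentence) - 84.6 * (syllables/word)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    n_sent, n_words = max(len(sentences), 1), max(len(words), 1)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (syllables / n_words)
```

Short sentences of short words push the score toward the top of the 0–100 scale; long words drag it down fast, because the syllables-per-word term carries an 84.6 weight against 1.015 for sentence length.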

A brief history

The first attempts to quantify reading difficulty came out of education research in the 1920s, but the modern era starts in 1948 with two simultaneous publications: Edgar Dale and Jeanne Chall's word-list-based formula, and Rudolf Flesch's Reading Ease score. Flesch's was the first widely adopted readability metric, used by the Associated Press, the US military, and eventually almost every word processor.

1952: Robert Gunning, a business writing consultant, publishes the Fog Index. He chose the word "fog" to describe writing that obscures rather than communicates.

1967: E. A. Smith and R. J. Senter develop the Automated Readability Index for the US Air Force, the first formula designed for computer-driven analysis using only character and word counts.

1969: G. Harry McLaughlin publishes SMOG ("Simple Measure of Gobbledygook") in the Journal of Reading. SMOG predicts 100% comprehension instead of the more lenient 50–75% threshold most other formulas used.

1975: Two milestone publications. J. Peter Kincaid and his colleagues publish Flesch-Kincaid Grade Level for the US Navy, recalibrating Flesch's 1948 Reading Ease formula onto a US grade-level scale. Meri Coleman and T. L. Liau publish the Coleman-Liau Index, the second character-based formula and the one that became standard for machine-driven content analysis.

None of these formulas has been meaningfully revised since publication. The math from 1948 still works because the underlying psychology of reading hasn't changed.

The two families

The six formulas split into two families based on what they count for word complexity:

Syllable-based: Flesch Reading Ease, Flesch-Kincaid, Gunning Fog, and SMOG estimate word complexity from syllable counts.

Character-based: Coleman-Liau and ARI estimate it from character counts instead.
The practical difference: syllable-based formulas can be fooled by unusual spellings, technical jargon, and proper nouns where the syllable counter underestimates. Character-based formulas sidestep that problem entirely but are slightly less precise for general English. For most writing the two families agree within a grade or two; where they diverge sharply, the divergence itself is informative.
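The contrast is visible in the formulas themselves. A minimal sketch using the published Flesch-Kincaid (syllable-based) and ARI (character-based) coefficients; the counts in the comment are hypothetical, chosen to represent one 15-word sentence of plain English:

```python
def fk_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level (syllable-based, 1975):
    0.39 * (words/sentence) + 11.8 * (syllables/word) - 15.59."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def ari(chars: int, words: int, sentences: int) -> float:
    """Automated Readability Index (character-based, 1967):
    4.71 * (chars/word) + 0.5 * (words/sentence) - 21.43."""
    return 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43

# Hypothetical sample: 15 words, 1 sentence, 22 syllables, 70 letters.
# fk_grade(15, 1, 22) and ari(70, 15, 1) land within half a grade of
# each other, as the two families usually do for ordinary prose.
```

Feed both functions a passage full of short-but-polysyllabic jargon and the syllable-based score climbs while the character-based one barely moves; that divergence is the informative signal described above.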

Side-by-side comparison

| Formula | Year | Output | Inputs | Best for |
| --- | --- | --- | --- | --- |
| Flesch Reading Ease | 1948 | 0–100, higher = easier | Sentence length + syllables/word | SEO, marketing, Yoast |
| Gunning Fog | 1952 | Years of education | Sentence length + complex-word % | Business, policy |
| ARI | 1967 | US grade (rounded up) | Sentence length + chars/word | Technical, military |
| SMOG | 1969 | US grade level | Complex word count | Healthcare, patient info |
| Flesch-Kincaid | 1975 | US grade level | Sentence length + syllables/word | General default, MS Word |
| Coleman-Liau | 1975 | US grade level | Sentence length + chars/word | Technical, machine analysis |

Which formula for which use case

If you're writing for a general audience

Default to Flesch-Kincaid. It's the most validated, most widely understood, and built into Microsoft Word's readability statistics. Pair with Flesch Reading Ease if your team prefers a 0–100 score.

If you're writing for healthcare or patient information

Use SMOG. It's the AMA standard and predicts 100% comprehension, which matters when the cost of misunderstanding is medical. The American Medical Association recommends a SMOG of 6 for patient-facing content.
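SMOG's published formula needs only two counts: polysyllabic words (three or more syllables) and sentences. A minimal sketch; the grade-6 threshold comes from the AMA guidance above, and `meets_ama_target` is a hypothetical helper name, not a standard API:

```python
import math

def smog_grade(polysyllables: int, sentences: int) -> float:
    """SMOG (1969): 3.1291 + 1.0430 * sqrt(polysyllables * 30 / sentences).
    Polysyllables = words of three or more syllables; the formula was
    calibrated on 30-sentence samples."""
    return 3.1291 + 1.0430 * math.sqrt(polysyllables * 30 / sentences)

def meets_ama_target(polysyllables: int, sentences: int,
                     target: float = 6.0) -> bool:
    # Grade-6 target for patient-facing content, per the AMA guidance.
    return smog_grade(polysyllables, sentences) <= target
```

Note how steeply the grade climbs with polysyllable density: a 30-sentence sample tolerates only a handful of complex words before the score passes grade 6, which is exactly the strictness the 100%-comprehension calibration buys.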

If you're writing for business or policy

Use Gunning Fog. It penalises jargon-heavy writing more aggressively than Flesch-Kincaid, which catches the kind of dense corporate prose that erodes comprehension without anyone noticing.

If you're writing technical documentation

Use Coleman-Liau or ARI. Character-based formulas are more reliable when the text contains technical terms, proper nouns, code identifiers, or anything else that confuses syllable counters.

If you don't know your audience yet

Score against all six and look at the consensus grade. Outliers are usually noise; agreement across formulas is signal. Use the calculator to see all six side by side.

How to use multiple formulas together

The most defensible readability assessment uses several formulas together. The pattern of scores across formulas tells you something each individual score cannot:

  • All six in the same grade band → high confidence in the result. Edit if needed; trust the number.
  • SMOG runs 1.5+ above Flesch-Kincaid → too many complex words. Vocabulary is the problem.
  • Gunning Fog runs above the others → jargon-heavy. Check noun choices specifically.
  • Coleman-Liau and ARI run above the syllable-based formulas → unusual spellings or technical terms. May not be a real readability issue if the audience is familiar with the vocabulary.
  • Flesch-Kincaid sits well below the others → sentence length is fine but vocabulary is dragging the rest. Tactic: word substitution.
  • One score is wildly out of band → likely a calibration artifact. Trust the consensus.
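A consensus check along these lines can be sketched in a few lines: take the median of the grade-level scores and flag anything too far from it. The 1.5-grade band is an illustrative choice, not a published standard:

```python
from statistics import median

def consensus(scores: dict[str, float], band: float = 1.5):
    """Return the median of the grade-level scores, plus any formula
    more than `band` grades away from that median."""
    mid = median(scores.values())
    outliers = {name: s for name, s in scores.items() if abs(s - mid) > band}
    return mid, outliers
```

With hypothetical scores like `{"flesch_kincaid": 8.2, "gunning_fog": 8.8, "smog": 9.0, "ari": 8.5, "coleman_liau": 11.6}`, the consensus sits at grade 8.8 and Coleman-Liau is flagged, matching the "character-based formulas run above the rest" pattern in the list above.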

This is what the calculator on the home page does — it computes all six in real time and shows them side by side, letting you see the pattern at a glance.

What readability formulas don't measure

Readability formulas measure the cognitive load of decoding text. They do not measure:

  • Whether the writing is good. Hemingway scores as very easy to read; a paragraph of word salad with short sentences and small words scores just as easy.
  • Whether the writing is true. Misinformation is just as readable as accurate information.
  • Whether the writing serves its audience. A sixth-grade reading level isn't right for a journal of theoretical physics.
  • Whether the structure makes sense. A document can score perfectly while being incoherent at the paragraph or section level.
  • Whether the audience knows the vocabulary. A passage full of medical terms may score "very confusing" but be perfectly readable to nurses.
  • The reader's prior knowledge. Familiarity with a subject lowers effective reading difficulty in ways no formula can capture.

The formulas are useful precisely because they measure something objective and reproducible. They are dangerous when treated as a proxy for quality. The right mental model: readability is a constraint to satisfy, not an objective to maximise. Once your text is comfortably readable for your audience, every other quality dimension matters more.

Frequently asked questions

What are the most common readability formulas?

The six in active use are Flesch Reading Ease (1948), Flesch-Kincaid (1975), Gunning Fog (1952), SMOG (1969), Coleman-Liau (1975), and ARI (1967).

Which readability formula is most accurate?

No single formula is universally most accurate. SMOG is most accurate for healthcare. Flesch-Kincaid is the most validated for general English. Coleman-Liau and ARI are more reliable for technical writing. Score against multiple formulas and look at consensus.

What is the difference between syllable-based and character-based formulas?

Syllable-based formulas (Flesch, Flesch-Kincaid, Gunning Fog, SMOG) count syllables. Character-based (Coleman-Liau, ARI) count characters. Character-based formulas are more reliable for technical writing where syllable counters get confused.

When were readability formulas invented?

Flesch Reading Ease and Dale-Chall in 1948, Gunning Fog 1952, ARI 1967, SMOG 1969, Flesch-Kincaid and Coleman-Liau both 1975. None has been meaningfully revised since.

Why do formulas give different scores?

Each weights inputs differently — some emphasise sentence length, others vocabulary; some use syllables, others characters; some predict 50% comprehension, others 100%. The variation is design, not error.