Literacy Glossary
What are diphthongs? Vowel sounds that glide
A definition you can quote
A diphthong is a vowel sound that glides from one position in the mouth to another within a single syllable. The word comes from Greek — diphthongos — and literally means “two sounds.” When you say cow slowly, your mouth starts in an open “ah” position and finishes closer to an “oo” position. That motion is the diphthong. A steady vowel doesn’t do that; the /a/ in cat keeps the mouth in one place from start to finish.
Linguistically, English has more diphthongs than most phonics programs teach explicitly. Long-i (as in bike) glides from an “ah” toward an “ee” — a true diphthong. Long-a (as in cake) glides similarly. But classrooms label those as long-vowel patterns and reserve the term diphthong for the two gliding sounds that don’t fit cleanly into the long-vowel system: /ow/ (as in cow) and /oy/ (as in boy).
For teachers and parents, the technical linguistics matter less than the practical mapping. In structured-literacy phonics, “diphthong” means /ow/ and /oy/ — full stop. Those are the patterns that get their own lessons, their own decodable books, and their own slot in the scope and sequence. Treating them as a small, contained unit is what makes them teachable.
A practical test: hold the sound. If you can stretch the vowel without your mouth changing shape, it’s a steady vowel (short or long). If the vowel forces a glide and you cannot hold it without splitting it into two distinct sounds, it’s a diphthong. Try it with cat (steady) and cow (glides). The mouth motion is the diagnostic.
The main diphthongs
Two diphthongs, four spellings.
/ow/ — the sound in cow, out, brown, mouth. Two spellings:
- ow — cow, now, brown, town, down, clown
- ou — out, sound, mouth, cloud, house, found
/oy/ — the sound in boy, coin, toy, point. Two spellings:
- oy — boy, toy, joy, enjoy, destroy
- oi — coin, boil, point, avoid, soil, noise
There is a strong positional pattern for /oy/: oy appears at the end of a word or syllable, oi appears in the middle. A student who internalizes that rule can spell almost every /oy/ word correctly on the first try: boil and boy, point and joy, avoid and destroy. The /ow/ spelling distribution is similar but less reliable: ow often appears at the end (cow, now, how), ou often appears in the middle (out, sound, mouth), but there are enough exceptions (brown, town, down, and clown all spell a mid-word /ow/ with ow) that the rule is taught as a tendency, not an absolute.
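The /oy/ positional rule is mechanical enough to write down in a few lines of Python. The sketch below is illustrative only: the function names and the pre-split syllable input are assumptions made for this example, not part of any phonics program, and it does not handle exceptions such as oyster.

```python
from typing import List, Tuple, Union

# A syllable is either plain text, or a hypothetical (onset, coda) pair
# holding the letters before and after an /oy/ sound in that syllable.
Syllable = Union[str, Tuple[str, str]]

def spell_oy_syllable(onset: str, coda: str) -> str:
    """Use oy when /oy/ ends the syllable, oi when letters follow it."""
    vowel = "oy" if coda == "" else "oi"
    return onset + vowel + coda

def spell_word(syllables: List[Syllable]) -> str:
    """Join syllables, applying the positional rule to each /oy/ syllable."""
    parts = []
    for syl in syllables:
        if isinstance(syl, tuple):
            parts.append(spell_oy_syllable(*syl))
        else:
            parts.append(syl)
    return "".join(parts)

print(spell_word([("b", "")]))         # boy
print(spell_word([("b", "l")]))        # boil
print(spell_word(["en", ("j", "")]))   # enjoy
print(spell_word(["a", ("v", "d")]))   # avoid
```

The only decision the rule makes is whether anything follows /oy/ inside its syllable, which is why it is easy for students to apply.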
Worth noting: long-i and long-a are technically diphthongs in linguistics — they glide. But they’re taught under the long-vowel umbrella in K-2 classrooms because they fit the long-vowel patterns (silent-e, open syllables, vowel teams ai/ay) that students learn much earlier. Calling them diphthongs in a 1st-grade classroom would only create confusion. Stick with /ow/ and /oy/ as the working definition.
How diphthongs differ from vowel teams
This is the distinction students and new teachers most often miss. Both diphthongs and vowel teams use two letters to make a vowel sound. But the sounds behave differently.
| Vowel teams | Diphthongs |
|---|---|
| Hold one steady sound | Glide from one sound to another |
| Mouth stays in one position | Mouth changes position mid-sound |
| Examples: ea (seat), ai (rain), oa (boat) | Examples: ow (cow), oi (coin) |
| Taught after silent-e, in 1st-2nd grade | Taught after vowel teams, in 2nd-3rd grade |
Say seat and cow in slow motion. Seat — your tongue stays put for the /ē/. Cow — your tongue and lips move from an open position to a more rounded one. That motion is the structural difference, and it’s why diphthongs sometimes feel harder to “hear” as a single vowel: they don’t sound like one steady thing.
For instruction, the distinction matters mostly because of decoding flexibility. The grapheme ow can be a vowel team (long-o, as in snow) or a diphthong (/ow/, as in cow). Same letters, different sounds. Students who treat ow as having only one pronunciation will misread half the words they encounter. Teaching diphthongs as a separate category — with the explicit instruction that ow has two pronunciations and the student should try one and switch if the word doesn’t make sense — is what builds the flexibility skilled decoders use without thinking.
The same is true for ou. Most of the time it’s the diphthong /ow/ (out, sound). But it can also represent /ŭ/ (touch, young), /ō/ (soul, though), /ŏŏ/ (could, would), or /oo/ (soup, group). Programs teach the most frequent pronunciation first (the diphthong) and introduce the alternates as “try the other sound” later, often during multisyllabic-word work.
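The “try the other sound” routine can be modeled as a short loop: try the candidate sounds in order and stop at the first one that makes a real word. The Python sketch below is a toy model under stated assumptions. The candidate order for ow, the tiny word set standing in for a child’s spoken vocabulary, and the function name are all invented for illustration.

```python
# Candidate sounds per grapheme, tried in order. For ou the diphthong comes
# first, as described above; the order for ow is an assumption.
CANDIDATES = {
    "ow": ["/ow/", "/ō/"],
    "ou": ["/ow/", "/ŭ/", "/ō/", "/ŏŏ/", "/oo/"],
}

# Toy stand-in for the reader's spoken vocabulary: (word, vowel sound) pairs.
REAL_WORDS = {
    ("cow", "/ow/"), ("snow", "/ō/"), ("down", "/ow/"), ("low", "/ō/"),
    ("now", "/ow/"), ("grow", "/ō/"), ("out", "/ow/"), ("touch", "/ŭ/"),
    ("soul", "/ō/"), ("could", "/ŏŏ/"), ("soup", "/oo/"),
}

def decode(word):
    """Return (sound that made a real word, list of sounds tried in order)."""
    for grapheme, sounds in CANDIDATES.items():
        if grapheme in word:
            tried = []
            for sound in sounds:
                tried.append(sound)
                if (word, sound) in REAL_WORDS:
                    return sound, tried
            return None, tried  # no candidate produced a known word
    return None, []  # word contains neither ow nor ou

print(decode("cow"))   # ('/ow/', ['/ow/'])
print(decode("snow"))  # ('/ō/', ['/ow/', '/ō/'])
```

A reader who stops after the first candidate is the student who reads snow with the vowel of cow; the loop is the flexibility habit.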
When diphthongs are taught
Late in the standard phonics sequence. Most Science of Reading (SoR)-aligned scope-and-sequences put diphthongs after:
- Short vowels and common consonants (K)
- Consonant digraphs sh, ch, th, wh, ng (K-1)
- Consonant blends st, bl, str (K-1)
- Silent-e and CVCe patterns (1)
- Common vowel teams ai, ay, ea, ee, oa (1-2)
- R-controlled vowels ar, er, ir, or, ur (2)
Diphthongs typically appear in late 1st grade or 2nd grade in UFLI, and 2nd or 3rd grade in Wilson and IMSE. Amplify CKLA introduces them across 1st-2nd grade depending on the unit. The exact placement varies by program, but the principle is the same: diphthongs are taught after the high-utility patterns are solid, partly because /ow/ and /oy/ words are less frequent in early-reader text and partly because the spelling overlap with long-o (ow) and with the alternate pronunciations of ou requires students to already be comfortable with flexible decoding.
A typical diphthong lesson covers one sound across both spellings — usually /oy/ with both oy and oi in the same week, then /ow/ with both ow and ou the following week. Cumulative review keeps both diphthongs and the long-vowel uses of ow in active rotation so students practice the “try the other sound” habit.
Mastery looks like: the student can decode unfamiliar /ow/ and /oy/ words on first encounter, can spell common words using the correct spelling (boy not boi, coin not coyn), and can flex between the diphthong and long-vowel pronunciations of ow without explicit prompting. Most students reach that mastery within 2-4 weeks of focused instruction plus ongoing cumulative review.
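The spelling half of that mastery check can be sketched as a small classifier that separates a positional swap (boi for boy, coyn for coin) from other errors. This is a hypothetical Python illustration: the function name and category labels are assumptions, not taken from any assessment tool.

```python
def classify_oy_spelling(target: str, response: str) -> str:
    """Label a dictated-word response for an /oy/ target word."""
    target = target.lower().strip()
    response = response.lower().strip()
    if response == target:
        return "correct"
    # Build the target with oy and oi exchanged, using a placeholder
    # character so the two replacements do not undo each other.
    swapped = target.replace("oy", "\0").replace("oi", "oy").replace("\0", "oi")
    if response == swapped:
        return "positional swap"  # right sound, wrong spelling for the position
    return "other error"

print(classify_oy_spelling("boy", "boi"))    # positional swap
print(classify_oy_spelling("coin", "coyn"))  # positional swap
print(classify_oy_spelling("coin", "coin"))  # correct
```

A positional swap suggests the student hears /oy/ but has not yet applied the end-versus-middle rule, which calls for different follow-up than a response that misses the sound entirely.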
A note on connected text. Diphthong words appear less often than CVC, silent-e, or vowel-team words in early-grade texts, which means decodables matched to the lesson have to be deliberate about featuring them. Without matched text, diphthong instruction stays at the word-list level and never makes the jump to fluent reading. The fix is the same as for every other phonics pattern: practice the new pattern inside connected, decodable text the same week it’s introduced.
How Storytime works with diphthong instruction
- Diphthong-tagged decodable library: every decodable book is tagged with the patterns it uses, including /ow/ and /oy/. When a class moves into diphthong instruction, the platform serves books that exercise the specific spellings being taught while keeping the rest of the inventory in cumulative rotation.
- Pattern-controlled generation: teachers can generate new decodable text for any diphthong lesson. Storytime respects the pattern caps for the lesson, so a class working on /oy/ won’t see /ow/ words inserted into their text before /ow/ has been taught.
- Mini-games for diphthongs: Word Sort, Real vs. Nonsense, and Sound Slide all support diphthong practice with both spellings of each sound. Students drag boy and coin into the /oy/ bin, sort snow and cow into separate bins for the two pronunciations of ow, and so on.
- Spelling-rule drills for positional patterns: the oy vs. oi rule (end of word vs. middle of word) is built into encoding practice. Students writing dictated words such as boy and boil see the platform reinforce the positional logic rather than treat the spellings as random.
- Skill Tree per-pattern detail: diphthong mastery rolls up under the phonics pillar with separate per-pattern signals for /ow/ and /oy/. Teachers see which diphthong spelling each student is still missing, not just an overall phonics score.
- Flexible-decoding support in BookReader: when a student stumbles on cow or snow, the platform’s read-aloud and text-to-speech (TTS) pronounce the word correctly so the student hears which pronunciation of ow applies in context, reinforcing the “try the other sound” habit.
Where to start
If you’re a teacher: don’t introduce diphthongs until short vowels, long vowels with silent-e, common vowel teams, and r-controlled vowels are solid for the class. Diphthongs are the dessert of the phonics sequence — wonderful, but only after the meal. When you do teach them, pair the sound with both spellings in the same lesson sequence and put the positional rule (oy end, oi middle) front and center. Students remember rules they can apply.
If you’re a parent: this is the part of the phonics sequence that often comes home as 2nd-grade homework. The most useful thing you can do is play “/ow/ versus /ō/” with ow words. Cow, snow, down, low, now, grow. Have the child read each one. If they get one wrong, ask them to try the other sound. That single habit — trying the other pronunciation when the first doesn’t sound like a real word — is the core skill the diphthong lesson is building.
If you’re working with a struggling decoder: check whether they’re treating ow and ou as each having a single, fixed pronunciation. Many students who learned long-o (snow) early and well then read cow as coe and out as oat without ever questioning it. The fix isn’t more diphthong drilling; it’s flexible-decoding practice with explicit instruction that ow can be two sounds and the reader chooses based on what makes a real word. Once that flips, /ow/ words tend to come together quickly.
Diphthongs are a small, contained piece of the phonics puzzle. Two sounds, four spellings, a positional rule, and a flexibility habit. Done well, the unit takes a few weeks. Done poorly, the same students who decode meat and boat fluently will stumble on cow and out for years. The difference is whether the instruction names the gliding sound explicitly, teaches both spellings together, and keeps the flexibility habit in cumulative review.
The good news: once those four spellings and the positional rule are in place, the diphthong unit is essentially closed. Students don’t have to come back and relearn it the way they sometimes have to circle back to vowel teams or r-controlled vowels. /ow/ and /oy/ are stable, low-volume patterns — easy to maintain in cumulative review without taking lesson time away from morphology and multisyllabic-word work in 3rd grade and beyond.
Frequently asked questions
- What is a diphthong?
- A diphthong is a vowel sound that glides from one position in the mouth to another within a single syllable. The word comes from the Greek diphthongos, meaning 'two sounds.' Unlike a steady vowel (the /a/ in cat) or a vowel team that holds one sound (the /ē/ in seat), a diphthong moves — your mouth changes shape in the middle of producing it. The two diphthongs explicitly taught in structured-literacy phonics are /ow/ as in cow and /oy/ as in boy.
- What are the main English diphthongs taught in phonics?
- Two: /ow/ and /oy/. The /ow/ sound has two common spellings — 'ow' (cow, brown, town) and 'ou' (out, sound, mouth). The /oy/ sound also has two — 'oy' (boy, toy, joy) usually at the end of a word, and 'oi' (coin, boil, point) usually in the middle. Linguistically, long-i and long-a are also diphthongs because they glide, but classrooms teach them as long-vowel patterns rather than under the diphthong label.
- How are diphthongs different from vowel teams?
- A vowel team holds one steady sound — 'ea' in seat is /ē/ from start to finish, with the mouth in one position. A diphthong glides — say cow slowly and you'll feel your mouth move from an 'ah' position toward an 'oo' position within the single vowel sound. Functionally, the distinction matters less than it sounds: both are taught as fixed letter-sound mappings the student learns to recognize and decode. The terminology distinguishes the linguistic mechanism, not the teaching routine.
- When are diphthongs taught?
- Late in the typical phonics sequence — usually 2nd or 3rd grade, after short vowels, consonant digraphs, blends, silent-e, vowel teams, and often r-controlled vowels are already solid. Diphthongs come last partly because they're less frequent than other patterns and partly because they share spellings with sounds students have already learned. UFLI introduces diphthongs in late 1st grade or 2nd grade; Wilson and IMSE typically schedule them in 2nd-3rd.
- Why do students confuse 'cow' and 'snow'?
- Same letters, different sounds. The 'ow' in cow is the diphthong /ow/ — a gliding sound. The 'ow' in snow is the long-o vowel team — a steady sound. English uses the same grapheme for both, so the student has to learn to try one pronunciation and switch if the word doesn't sound right. This is the same flexible-decoding strategy students use for 'ea' in meat versus bread. Teaching it as an explicit 'try the other sound' habit accelerates resolution.
- How is /oy/ spelled — 'oy' or 'oi'?
- Position determines spelling. 'Oy' appears at the end of a word or syllable (boy, toy, joy, enjoy). 'Oi' appears in the middle of a word or syllable (coin, boil, point, avoid). This is one of the most reliable positional spelling rules in English; students who learn the rule can spell almost every /oy/ word correctly. The same positional pattern is loosely true for /ow/: 'ow' often appears at the end (cow, now, how), 'ou' often appears in the middle (out, sound, mouth), but the /ow/ rule has more exceptions than the /oy/ rule (brown, town, and down all use 'ow' mid-word).
- How are diphthongs assessed?
- Like other late-stage phonics patterns: nonsense-word reading and connected-text fluency. A student who can decode 'foit' or 'noud' has internalized the diphthong-grapheme mapping; a student who reads them as 'fot' or 'nod' is still not reading the two-letter vowel as a single gliding unit. In structured-literacy classrooms, diphthong mastery is also checked through dictation: the student writes a dictated word and the spelling reveals whether they've internalized the positional rules for 'oy/oi' and 'ow/ou.'