Hello fellow Vietnamese learners! I have been writing many notes on pronunciation and other aspects of learning Vietnamese as I have progressed on my own journey, and here I share some of those notes in the hope that they are helpful to you. Feedback is welcome.
The following is a rundown of vowel sounds in Vietnamese. I have organized it by sound rather than by letter, since the sound of a language is where it all begins. A few caveats: the literature disagrees on many of the sounds (I chose Glossika’s descriptions, as it is my main listening/pronunciation practice resource), the sounds are likely to vary a bit based on word structure, and some sounds will be more difficult than others because they do not appear in English.
There are 12 vowels going by letters, though i and y are pronounced the same; the only difference is that certain words and letter sequences use one or the other. As an interesting cultural tidbit, there have been several large-scale efforts by the government to remove the y (except in proper names and other specific contexts). The number might then be expanded to 14 sounds, based on some variations in a couple of letters. There are 3 sets of sounds differentiated primarily by length (the longer sound is denoted by a “:” in IPA).
Where English equivalents exist, those will be used. Where there are differences, I will provide an explanation of tongue position. For reference, the IPA chart for vowels shows where in the mouth the vowel is produced, with the right side being the back of the mouth and the bottom being the floor of the mouth.
"IPA: Vowels, https://www.internationalphoneticassociation.org/content/ipa-vowels, available under a Creative Commons Attribution-Sharealike 3.0 Unported License. Copyright © 2015 International Phonetic Association."
Sound type: near-open central unrounded vowel
Equivalent letters: a, ă
This sound does not appear in most dialects of American English. The tongue is positioned near the base of the mouth and central in the mouth.
IPA: ə, ə:
Sound type: mid central unrounded vowel
Equivalent letters: a, â, ô (short); ơ (long)
Same sound as the “a” in “comma” or “Tina” in American English.
IPA: i, i:
Sound type: close front unrounded vowel
Equivalent letters: i, y
Same sound as the “ee” in “free” or “see” in American English.
IPA: ɯ, ɯ:
Sound type: close back unrounded vowel
Equivalent letters: ư
This sound does not appear in American English, though there is a rounded equivalent – the “oo” in “boot.” Rounded vowels are formed by making an “o” shape with your lips, so simply force your mouth to stay flat for this one.
Sound type: near-close near-back rounded vowel
Equivalent letters: u
This sound is close to the “oo” in “book” or the “u” in “put” in American English. The tongue is near the upper back of the mouth, ever so slightly more forward and lower than for the “oo” in “boot.”
Sound type: open front unrounded vowel
Equivalent letters: a
This sound does not appear in American English. The tongue is located at the front and bottom of the mouth, right behind the bottom set of teeth.
Sound type: open-mid front unrounded vowel
Equivalent letters: e
Same sound as the “e” in “bed” or “met” in American English.
Sound type: close-mid front unrounded vowel
Equivalent letters: ê
This sound does not appear in American English. This sound comes from a tongue position slightly lower than the “ee” in “see” in American English.
Sound type: open-mid back rounded vowel
Equivalent letters: o
Same sound as the “ough” in “thought” or “bought” in American English.
Sound type: close-mid back rounded vowel
Equivalent letters: ô
This sound does not appear in American English. This sound comes from a tongue position slightly lower than the “oo” in “boot” and closer to the back of the mouth than the ʊ.
Sound type: close back rounded vowel
Equivalent letters: u
Same sound as the “oo” in “boot” in American English.
There are 6 tones, and they change the meaning of a syllable. This contrasts with English, where tone is used primarily to express emotion. There is regional variation; here I describe the northern tones.
- Ngang (a) – mid level – rather than being flat, it is a bit high and even
- Huyền (à) – low falling – starts a bit high, then drops
- Hỏi (ả) – mid falling – dips and rises
- Ngã (ã) – glottalized rising – rises, with a forced break in the middle
- Sắc (á) – rising – rises, sounds similar to a questioning tone in English
- Nặng (ạ) – glottalized falling – starts low, forced break at the end
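If you work with Vietnamese text on a computer, the tone marks in the list above can also be read off programmatically. Here is a minimal Python sketch, assuming the standard Unicode behavior that each tone mark decomposes into its own combining character under NFD normalization. The names `TONE_MARKS` and `tone_of` are made up for illustration, and the function assumes it is given a single syllable.

```python
import unicodedata

# Combining characters used for the five written tone marks.
# Ngang has no mark, so it is the fallback.
TONE_MARKS = {
    "\u0300": "huyền",  # combining grave accent, as in à
    "\u0309": "hỏi",    # combining hook above, as in ả
    "\u0303": "ngã",    # combining tilde, as in ã
    "\u0301": "sắc",    # combining acute accent, as in á
    "\u0323": "nặng",   # combining dot below, as in ạ
}


def tone_of(syllable):
    """Return the tone name for one syllable (hypothetical helper).

    NFD splits a letter such as "ấ" into "a" + circumflex + acute,
    so the tone mark can be found separately from the vowel-quality
    marks (circumflex, breve, horn), which are ignored here.
    """
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return "ngang"


if __name__ == "__main__":
    for word in ["ma", "mà", "mả", "mã", "má", "mạ"]:
        print(word, tone_of(word))
```

This only identifies which tone is written; it says nothing about how the tone is realized in a given dialect.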
To help with learning the tones, you can pair each one with an action to further burn it into your memory. Here are the ones I have used for association:
- Ngang (a) – hands moving forward, just below eye level
- Huyền (à) – hands pushing down, starting from chest level
- Hỏi (ả) – dipping motion with either hand
- Ngã (ã) – forearms at shoulder level and rising, with a deliberate stop midway
- Sắc (á) – hands lifting up, starting from just below chest level
- Nặng (ạ) – dropping an (imaginary) stone from stomach level
For more information on pronunciation, see the learning materials post.
Your e: example should be an ê not e.
Differences in Southern Vietnamese
Inevitably, as a result of 300+ years of intermingling with the Khmer, Hoa (Chinese) and Cham, amongst other cultures, Southern Vietnamese has taken on board a lot of sound changes that give it its unique feel compared to the other regional dialects. The purpose of this isn’t to force you to speak in a Southern manner but to help you understand people who speak these dialects, as they account for a slight majority of Vietnamese speakers. Most Vietnamese residing in the USA, Canada and Australia speak this dialect. As Ho Chi Minh City (Saigon) is the economic hub as well as an important centre of education, culture and entertainment, it's handy to know the differences.
a: This letter is often pronounced as æ. Think of the word cat or can. It can also sometimes mutate into an ă sound. Sách and sanh are pronounced sắt and săn or xắt and xăn. The letter s can be pronounced similar to an English sh but more hissy. In the North it's merged with x-. Sanh is the Southern equivalent of sinh. The same happens for many words like chính phủ/chánh phủ and nóng tính/nóng tánh.
ê: This letter sometimes mutates into an ơ. E.g. lên is usually pronounced lơn in the South, kết is usually pronounced cớt. The mutation also happens for -êch and -ênh although there are some exceptions. Ếch is pronounced ớt and mệnh is mợn. Bệnh (sick) is the written form but most Southerners still say bịnh (pronounced bựn).
i: This letter sometimes mutates into an ư. E.g. xin is usually pronounced xưn in the South, kít is usually pronounced cứt. <-- You just learnt a new word meaning poop! :) The same mutation applies to -ich and -inh. Xích is pronounced xứt and xinh is pronounced xưn.
-c: This is usually -c but sometimes -cp (for -oc, -ôc, -uc). In other words you close your mouth as you pronounce those words such that óc will sound like óp or ócp.
ch-: This initial is different from the North. In the South you suck in more air. Your jaw drops lower.
-ch: This final is always -t in the South.
-n: This final consonant often mutates into -ng. In fact the only cases where it's still an -n are -anh, -ên, -ênh, -in and -inh. Those keep the -n but the preceding vowels mutate into, respectively: -ăn, -ơn, -ơn, -ưn and -ưn.
-ng: Note that -ong, -ông and -ung force you to close your mouth as you say the words so it ends up in an -m position. This is why ong (bee) and ông (man) sound a bit like om and ôm. They're -ongm, -ôngm and -ungm. This happens in all dialects. In the South there's an addition of -un which merges with -ung. It's similar to what happens with -oc, -ôc and -uc (and -ôt, -ut in the South).
-t: This is usually mutated into -c in the South, with the exception of -êt and -it (pronounced -ơt and -ưt). Also, -ôt becomes -ôôc. This sound only exists in the South: take the vowel ô and then add a -c. It's not -ôc (which has a short ô sound) but a long ô sound. Then apply the closing of your mouth to produce -ôôcp. This is why the number one is pronounced like môộcp in the South.
tr: This is different from the North where they merge ch/tr. In the South it's more like a "jr/dr" sound.
v: This is often merged with d- and gi- and pronounced /j/ (like in yes) while in the North there is a three-way merger of d-, gi- and r- as /z/.
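For anyone who likes to see rules as a lookup table, the final-consonant changes described above (-ch, -n and -t) can be sketched in Python. This is only a rough, spelling-based approximation of the notes in this post, not a full model of Southern pronunciation: it works on a toneless rime (the vowel plus final consonant, e.g. "an" or "inh"), it ignores the mouth-closing effect on -oc/-ôc/-uc and -ong/-ông/-ung, and the names `EXCEPTIONS` and `southern_rime` are made up for illustration.

```python
import unicodedata

# Rimes whose vowel also changes, as listed in the notes above.
# Keys and values are toneless spellings, not IPA.
EXCEPTIONS = {
    "anh": "ăn", "ên": "ơn", "ênh": "ơn", "in": "ưn", "inh": "ưn",
    "ach": "ăt", "êch": "ơt", "ich": "ưt",
    "êt": "ơt", "it": "ưt",
    "ôt": "ôôc",
}
# Normalize keys so lookups do not depend on how the text was typed.
EXCEPTIONS = {unicodedata.normalize("NFC", k): v for k, v in EXCEPTIONS.items()}


def southern_rime(rime):
    """Approximate the Southern reading of a toneless rime (hypothetical helper)."""
    rime = unicodedata.normalize("NFC", rime)
    if rime in EXCEPTIONS:
        return EXCEPTIONS[rime]
    if rime.endswith("ch"):
        return rime[:-2] + "t"   # final -ch is always -t
    if rime.endswith(("ng", "nh")):
        return rime              # left alone in this sketch
    if rime.endswith("n"):
        return rime[:-1] + "ng"  # final -n usually becomes -ng
    if rime.endswith("t"):
        return rime[:-1] + "c"   # final -t usually becomes -c
    return rime


if __name__ == "__main__":
    for r in ["an", "at", "inh", "ich", "ôt", "ong"]:
        print(r, "->", southern_rime(r))
```

The exceptions are checked first by exact match, so a longer rime such as "uyên" falls through to the general -n rule rather than being treated like "ên".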
Southern Vietnamese tones
• Ngang (a) – mid level – similar to Northern
• Huyền (à) – low falling – similar to Northern
• Hỏi (ả) – low rising to high – starts low and rises to high with no breaks
• Ngã (ã) – low rising to high – starts low and rises to high with no breaks (merges with hỏi)
• Sắc (á) – rising – similar to Northern
• Nặng (ạ) – low rising to mid – starts low and rises to mid with no breaks
Thanks, I overlooked that. Thanks also for sharing differences with the southern dialect.
Also, <s> and <x> are different: the former is pronounced like Russian <ш> and Pinyin <sh>, i.e. a retroflex voiceless sibilant.
ɒ is actually the rounded vowel. The open back unrounded vowel is ɑ. Sometimes I wish IPA had more distinctive letters.
It is back as well; it looks like in copying it over a different character slipped in. It is matched with the correct symbol now, thanks for the catch.
I agree with you on the IPA symbols, although there are some interesting alternatives floating around if you have enough free time...one I like aesthetically is Lierean script. The downside is that they aren't always as thorough; the Lierean script does not have characters assigned for tone or diacritics. Something like Visible Speech is more thorough but ugly. In the end, the effort pursuing one or the other could probably be more efficiently spent through brute force memorization (unless you plan to learn several phonetically different languages).
Cảm ơn, kuah! I always have a hard time reading vowels and sounding them accordingly in Vietnamese. This will help me.
They sound very different to me, and I imagine they do to most Vietnamese speakers as well. The sounds are definitely reproducible for you - try playing the clip on a loop, then moving your tongue around until you can approximate the sound. I think though that it will be hard to remember in practice, and you might get more mileage from replicating word sounds you hear as opposed to vowels in isolation.