
https://www.duolingo.com/profile/Arnauti

Known pronunciation errors in the TTS

March 2, 2015

33 Comments


https://www.duolingo.com/profile/vivisaurus

Thanks for compiling this list, Arnauti! Some are a mystery to me (det=dom in this one case), and some are understandable (kör and kör are two different words, and we haven't had any luck in finding a TTS that can say "read" and "read" correctly either). It is hard to find a perfect voice, but this one makes fewer mistakes than the previous one, huh? =]


https://www.duolingo.com/profile/Arnauti

It's much, much better than the previous one. Reading "el" as "eller liknande" isn't good, but the other one read "-" out loud as "bindestreck", so… The new one only has rare errors on less common words, whereas the old one made mistakes on some very common ones. I think "de" and "bakom" especially were unforgivable errors of the old voice.
And thank you vivisaurus for helping us get this new voice in place!

PS I didn't compile the list all on my own, we've been doing it together in the internal wiki.


https://www.duolingo.com/profile/Yerrick

The answer is probably "not in the current system", but I wonder how possible it would be to mark these problem words somehow and replace their sentence recordings with sound files from a different TTS program?
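
To make the idea concrete, here is a rough sketch of what I mean. Everything in it is hypothetical: the engine labels "primary" and "fallback", the function name, and the word list (which I just took from words reported in this thread) are placeholders, not anything Duolingo actually has.

```python
import re

# Hypothetical list of problem words, taken from reports in this thread.
FLAGGED_WORDS = {
    "kör", "banan", "planet", "tunnelbanan",
    "svans", "sporten", "senapen", "brödet",
}

def pick_engine(sentence):
    """Return which (hypothetical) TTS engine should record this sentence."""
    # \w matches Unicode letters in Python 3, so å, ä and ö are handled.
    words = set(re.findall(r"\w+", sentence.lower()))
    return "fallback" if words & FLAGGED_WORDS else "primary"

print(pick_engine("Lejonet sitter på sin svans"))
print(pick_engine("Jag äter ett äpple"))
```

This only flags by spelling, so it would also reroute sentences where the main voice happens to get the word right.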


https://www.duolingo.com/profile/devalanteriel

I like the idea. But even if that is possible, I doubt its feasibility as a long-term solution. TTS systems tend to generate the speech on the fly (with caching to prevent unnecessary calculations), meaning that virtually any word can change when an update is applied to the system. And it may well be that the word is pronounced correctly when used in conjunction with other words (not including the el error here), since the correct pronunciation by necessity depends on syntactic analysis as well as prosodic information, both of which have to be derived contextually.
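
As an illustration of the on-the-fly-plus-caching point, here is a minimal sketch. It assumes the cache is keyed on the voice version together with the text, which is only my guess at a sensible design and not how any particular TTS service is known to work; the class name, `synthesize` callback and version strings are all made up for the example.

```python
import hashlib

class CachedTTS:
    """Toy cache around a TTS function; names here are hypothetical."""

    def __init__(self, synthesize, voice_version):
        self._synthesize = synthesize   # callable: text -> audio bytes
        self._version = voice_version
        self._cache = {}
        self.calls = 0                  # number of real synthesis runs

    def _key(self, text):
        # The version is part of the key, so an update invalidates old audio.
        raw = f"{self._version}\n{text}".encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def upgrade(self, synthesize, voice_version):
        """Swap in an updated voice; old cache entries are no longer hit."""
        self._synthesize = synthesize
        self._version = voice_version

    def audio_for(self, text):
        key = self._key(text)
        if key not in self._cache:
            self.calls += 1
            self._cache[key] = self._synthesize(text)
        return self._cache[key]
```

With a scheme like this, a hand-picked replacement recording for one word would have to be re-checked after every voice update, since the regenerated audio for the whole sentence can differ.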


https://www.duolingo.com/profile/MarkBorkBorkBork

Google does have an API for their TTS engine, which includes Swedish. There may be licensing issues, but it works, and as you can see, super simple to program.


https://www.duolingo.com/profile/devalanteriel

Yup. I've actually used that myself (and the corresponding SR), although only the English part.


https://www.duolingo.com/profile/MarkBorkBorkBork

I wish Duolingo would fix breaking ampersands in URLs though.


https://www.duolingo.com/profile/Yerrick

Huh. Somehow, I'd assumed DL would pregenerate all sentences in their courses and just pull each file up when necessary. But I suppose there's nothing wrong with your method, either. It would save on storage, not to mention being more easily extensible to add new sentences.


https://www.duolingo.com/profile/devalanteriel

I'm sure they do have some measure of caching. But I would also assume that the TTS is under active development, which will inevitably amount to occasional changes.


https://www.duolingo.com/profile/devalanteriel

Thanks for commenting. Is the text input reparsed, or double parsed in any way? The det -> dom case would make sense if the det was converted into de in any way, either through text shortening or because det is pronounced de - which is itself pronounced dom.

Neither reason sounds particularly plausible, and the Occam explanation would be that it's simply an odd mistake. Just throwing it out there. All in all, a major improvement if you ask me. :)


https://www.duolingo.com/profile/MarkBorkBorkBork

I contacted Ivona support with a link to this thread. Hopefully they respond.


https://www.duolingo.com/profile/MarkBorkBorkBork

So they did reply to me, and my message with a link to this thread has been forwarded to the developers behind the Astrid voice. I don't know how often they release updates though, nor how long it will take Duolingo to install any updates.


https://www.duolingo.com/profile/HelenCarlsson

That's interesting! It could be a problem with words like "kör", "banan" and "planet" though, since the faulty pronunciations actually exist but mean something else.

The old voice had the right pronunciation for choir (kör) but not for drive (kör) and for the new one it's the other way around. I guess it's complicated to make the computer voice able to choose the right one for both cases.


https://www.duolingo.com/profile/MarkBorkBorkBork

A lot of the errors can be fixed by marking up the text passed to the Ivona engine. I don't know how much integration work the Duolingo people have done, but it should be possible to fix most of the errors above by providing hints to the TTS.


https://www.duolingo.com/profile/HelenCarlsson

You mean adding stuff like <w role="ivona:VB">kör</w> to get a soft "k" for kör = drive? I guess that must be fixed by the Duolingo team then? Anyway, that would be really great!


https://www.duolingo.com/profile/MarkBorkBorkBork

Yes, it should be as simple as that.
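
For illustration, a minimal sketch of generating that kind of markup. The only role value used is the "ivona:VB" one quoted above; the hint table, the sense labels and the function name are hypothetical, and I'm assuming the engine accepts standard SSML wrapped in a speak element. Whether the Swedish voice actually honors the hint is something only a real test would show.

```python
from xml.sax.saxutils import escape

# Hypothetical hint table: (word, sense) -> role attribute value.
# Only the verb role mentioned in this thread is used.
ROLE_HINTS = {
    ("kör", "drive"): "ivona:VB",
}

def mark_up(tokens):
    """Build an SSML string from (word, sense) pairs; sense may be None."""
    parts = []
    for word, sense in tokens:
        role = ROLE_HINTS.get((word.lower(), sense))
        text = escape(word)  # keep &, < and > from breaking the XML
        if role:
            parts.append(f'<w role="{role}">{text}</w>')
        else:
            parts.append(text)
    return "<speak>" + " ".join(parts) + "</speak>"

print(mark_up([("Jag", None), ("kör", "drive"), ("bilen", None)]))
```

Words without a hint, such as kör in the choir sense, pass through unmarked and are left to the engine's default reading.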


https://www.duolingo.com/profile/devalanteriel

One more: sporten sounds like spårten.

Skåningar need not complain. :)


https://www.duolingo.com/profile/Arnauti

Thank you!


https://www.duolingo.com/profile/H82or8

Love the new voice. Tack så mycket.


https://www.duolingo.com/profile/davost

Tunnelbanan is pronounced as tunnel banana :) https://www.duolingo.com/comment/7307632


https://www.duolingo.com/profile/ZaffDragonslayer

Might be the funniest mistake I've heard XD


https://www.duolingo.com/profile/Arnauti

Thank you, I'm adding it to the list above!


https://www.duolingo.com/profile/Super-Svensk

I have noticed that bön sounds kind of weird (like bö-ön), as if it is two syllables. I don't know if that really counts as a mispronunciation, though... :)


https://www.duolingo.com/profile/HelenCarlsson

"Svans" in "Lejonet sitter på sin svans" is still incorrectly pronounced (long a instead of short).


https://www.duolingo.com/profile/Arnauti

Thanks, I'll add it!


https://www.duolingo.com/profile/jarrettkong

Kanske is pronounced incorrectly as well.


https://www.duolingo.com/profile/Arnauti

Not generally. Was that in a specific sentence, and do you remember which one? (Sorry for the late answer.)


https://www.duolingo.com/profile/PachaTchernof

"Fisken och brödet" https://www.duolingo.com/comment/6291688

The word "brödet" sounds like "brönet" at normal speed. At slow speed it sounds OK.


https://www.duolingo.com/profile/xaghtaersis

Good to know! Fortunately I listen to a lot of Swedish so I will hear the correct pronunciation anyway.


https://www.duolingo.com/profile/Kaminegg

Don't know about the previous one's mistakes, but I know I am immediately pleased with the pacing and energy of the new voice. Huzzah!


https://www.duolingo.com/profile/devalanteriel

Here's another one: senapen sounds like sénapen.

Though admittedly, the error is small.


https://www.duolingo.com/profile/Aridochichimodo

There are a few glitches with the new voice (or it may just be my computer): in the "type what you hear" exercises, the voice jumps and stops multiple times.
