https://www.duolingo.com/Arnauti

Known pronunciation errors in the TTS

March 2, 2015

33 Comments


https://www.duolingo.com/vivisaurus

Thanks for compiling this list, Arnauti! Some are a mystery to me (det=dom in this one case), and some are understandable (kör and kör are two different words, and we haven't had any luck in finding a TTS that can say "read" and "read" correctly either). It is hard to find a perfect voice, but this one makes fewer mistakes than the previous one, huh? =]

March 2, 2015

https://www.duolingo.com/Arnauti

It's much, much better than the previous one. Reading el as eller liknande isn't good, but the old voice read - out loud as bindestreck, so… The new one only has rare errors on less common words, whereas the old one made mistakes on some very common ones. I think especially de and bakom were unforgivable errors of the old voice.
And thank you vivisaurus for helping us get this new voice in place!

PS I didn't compile the list all on my own, we've been doing it together in the internal wiki.

March 3, 2015

https://www.duolingo.com/Yerrick

The answer is probably "not in the current system", but I wonder how possible it would be to mark these problem words somehow and replace their sentence recordings with sound files from a different TTS program?
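
A minimal sketch of what "marking problem words" could look like, assuming a hypothetical setup with a primary voice and a fallback voice. The function name, the engine labels, and the idea of routing whole sentences are illustrative assumptions, not anything Duolingo is known to do; the word list is taken from errors reported in this thread.

```python
import re

# Words the primary voice is reported to mispronounce (from this thread).
PROBLEM_WORDS = {"tunnelbanan", "sporten", "senapen", "svans"}


def pick_engine(sentence: str) -> str:
    """Return "fallback" if the sentence contains a flagged word, else "primary".

    Both labels are hypothetical placeholders for two different TTS programs.
    """
    # \w matches Swedish letters such as å, ä and ö on Python 3 strings.
    words = re.findall(r"\w+", sentence.lower())
    return "fallback" if any(w in PROBLEM_WORDS for w in words) else "primary"
```

For example, `pick_engine("Lejonet sitter på sin svans")` selects the fallback voice, while a sentence without flagged words stays on the primary one. A plain word list cannot tell homographs like kör (choir) and kör (drive) apart, so it would only help with words that are wrong in every context.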

March 6, 2015

https://www.duolingo.com/devalanteriel

I like the idea. But even if that is possible, I doubt its feasibility as a long-term solution. TTS systems tend to generate the speech on-the-fly (with caching to prevent unnecessary calculations), meaning that virtually any word can change when an update is applied to the system. And it may well be that the word is pronounced correctly when used in conjunction with other words (not including the el error here), since the correct pronunciation by necessity depends on syntactic analysis as well as prosodic information, both of which have to be derived contextually.

March 6, 2015

https://www.duolingo.com/MarkBorkBorkBork

Google does have an API for their TTS engine, which includes Swedish. There may be licensing issues, but it works, and as you can see, super simple to program.

March 6, 2015

https://www.duolingo.com/devalanteriel

Yup. I've actually used that myself (and the corresponding SR), although only the English part.

March 6, 2015

https://www.duolingo.com/MarkBorkBorkBork

I wish Duolingo would fix the way it breaks ampersands in URLs, though.

March 6, 2015

https://www.duolingo.com/Yerrick

Huh. Somehow, I'd assumed DL would pregenerate all sentences in their courses and just pull each file up when necessary. But I suppose there's nothing wrong with your method, either. It would save on storage, not to mention being more easily extensible to add new sentences.

March 6, 2015

https://www.duolingo.com/devalanteriel

I'm sure they do have some measure of caching. But I would also assume that the TTS is under active development, which will inevitably amount to occasional changes.
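
To illustrate the caching point, here is a rough sketch of an on-the-fly cache whose keys include the engine version, so that audio generated before an engine update is not served afterwards. The class, the `synthesize` callable, and the version strings are all assumptions made for illustration; nothing here reflects how Duolingo or Ivona actually store audio.

```python
import hashlib


class TTSCache:
    """Cache synthesized audio, keyed by engine version plus sentence text."""

    def __init__(self, synthesize, engine_version):
        self._synthesize = synthesize      # hypothetical callable: str -> bytes
        self.engine_version = engine_version
        self._store = {}
        self.misses = 0                    # how often we had to synthesize

    def _key(self, text):
        # Including the version means an engine update changes every key.
        raw = f"{self.engine_version}\x00{text}".encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def get(self, text):
        key = self._key(text)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._synthesize(text)
        return self._store[key]
```

With this layout, requesting the same sentence twice synthesizes it once, but changing `engine_version` forces a fresh synthesis on the next request. That is the trade-off discussed above: a hand-picked replacement recording would have to be re-checked after every engine update.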

March 6, 2015

https://www.duolingo.com/devalanteriel

Thanks for commenting. Is the text input reparsed, or double parsed in any way? The det -> dom case would make sense if the det was converted into de in any way, either through text shortening or because det is pronounced de - which is itself pronounced dom.

Neither reason sounds particularly plausible, and the Occam explanation would be that it's simply an odd mistake. Just throwing it out there. All in all, a major improvement if you ask me. :)

March 2, 2015

https://www.duolingo.com/MarkBorkBorkBork

I contacted Ivona support with a link to this thread. Hopefully they respond.

March 5, 2015

https://www.duolingo.com/MarkBorkBorkBork

So they did reply back to me, and my message with a link to this thread has been forwarded to the developers behind the Astrid voice. I don't know how often they release updates though, nor how long it will take Duolingo to install any updates.

March 7, 2015

https://www.duolingo.com/HelenCarlsson

That's interesting! It could be a problem with words like "kör", "banan" and "planet" though, since the faulty pronunciations actually exist but mean something else.

The old voice had the right pronunciation for choir (kör) but not for drive (kör) and for the new one it's the other way around. I guess it's complicated to make the computer voice able to choose the right one for both cases.

March 8, 2015

https://www.duolingo.com/MarkBorkBorkBork

A lot of the errors can be fixed by marking up the text passed to the Ivona engine. I don't know how much integration work the Duolingo people have done, but it should be possible to fix most of the errors above by providing hints to the TTS.

March 8, 2015

https://www.duolingo.com/HelenCarlsson

You mean adding stuff like <w role="ivona:VB">kör</w> to get a soft "k" for kör = drive? I guess that must be fixed by the Duolingo team then? Anyway, that would be really great!

March 8, 2015

https://www.duolingo.com/MarkBorkBorkBork

Yes, it should be as simple as that.
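
As a rough sketch of that kind of markup, the helper below wraps a chosen word in an SSML-style `w` element before the sentence is handed to the TTS. The `ivona:VB` role is the one mentioned in this thread; the function name, the `speak` wrapper, and the assumption that the sentence text can be preprocessed this way are illustrative only, and whether the engine accepts this exact markup would need checking against its documentation.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def tag_word(sentence, word, role):
    """Wrap each whole-word occurrence of `word` in a <w role="..."> hint."""
    # Lookarounds keep "kör" from matching inside longer words such as "körde".
    pattern = re.compile(rf"(?<!\w){re.escape(word)}(?!\w)", re.IGNORECASE)
    parts = []
    last = 0
    for match in pattern.finditer(sentence):
        # Escape the plain text between matches so stray & or < stay valid XML.
        parts.append(escape(sentence[last:match.start()]))
        parts.append(f"<w role={quoteattr(role)}>{escape(match.group())}</w>")
        last = match.end()
    parts.append(escape(sentence[last:]))
    return "<speak>" + "".join(parts) + "</speak>"
```

Calling `tag_word("Jag kör bil", "kör", "ivona:VB")` produces `<speak>Jag <w role="ivona:VB">kör</w> bil</speak>`. Someone still has to decide per sentence whether kör is the verb or the noun, which is why this would have to be done by whoever maintains the course sentences.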

March 8, 2015

https://www.duolingo.com/devalanteriel

One more: sporten sounds like spårten.

Skåningar need not complain. :)

March 7, 2015

https://www.duolingo.com/Arnauti

Thank you!

March 7, 2015

https://www.duolingo.com/H82or8

Love the new voice. Tack så mycket.

March 3, 2015

https://www.duolingo.com/davost

Tunnelbanan is pronounced as tunnel banana :) https://www.duolingo.com/comment/7307632

March 3, 2015

https://www.duolingo.com/ZaffDragonslayer

Might be the funniest mistake I've heard XD

September 4, 2015

https://www.duolingo.com/Arnauti

Thank you, I'm adding it to the list above!

March 3, 2015

https://www.duolingo.com/Super-Svensk

I have noticed that bön sounds kind of weird (like bö-ön), as if it is two syllables. I don't know if that really counts as a mispronunciation, though... :)

March 7, 2015

https://www.duolingo.com/HelenCarlsson

"Svans" in "Lejonet sitter på sin svans" is still incorrectly pronounced (long a instead of short).

March 8, 2015

https://www.duolingo.com/Arnauti

Thanks, I'll add it!

March 8, 2015

https://www.duolingo.com/jarrettkong

Kanske is pronounced incorrectly as well.

March 16, 2015

https://www.duolingo.com/Arnauti

Not generally – was that in a specific sentence, do you remember which one? (sorry for late answer)

July 5, 2015

https://www.duolingo.com/PavelTchernof

"Fisken och brödet" https://www.duolingo.com/comment/6291688

The word "brödet" sounds like "brönet" (normal speed). At slow speed it sounds OK.

March 17, 2015

https://www.duolingo.com/jdfromdublin

Tack!

March 2, 2015

https://www.duolingo.com/xaghtaersis

Good to know! Fortunately I listen to a lot of Swedish so I will hear the correct pronunciation anyway.

March 3, 2015

https://www.duolingo.com/Kaminegg

Don't know about the previous one's mistakes, but I know I am immediately pleased with the pacing and energy of the new voice. Huzzah!

March 3, 2015

https://www.duolingo.com/devalanteriel

Here's another one: senapen sounds like sénapen.

Though admittedly, the error is small.

March 5, 2015

https://www.duolingo.com/Aridochichimodo

There are a few glitches with the new voice (or it may just be my computer): in the "type what you hear" exercises, the voice jumps and stops multiple times.

March 6, 2015