A bit disappointed about the TTS
We've waited so long because of it, and frankly I find it a bit underwhelming. Tatyana from Ivona seems to pronounce words much more clearly. Just copy and paste any of the sentences over on this site, choose the Tatyana voice, and compare for yourselves:
Of course, we should also keep in mind that the Duo team is a very talented one and the course is still in beta.
Tatyana's accent is weird. A bit Ukrainian, I think. I would hardly select a voice with a non-standard accent from an unknown region. Maxim is what we considered, but since IVONA's synths make a huge number of mistakes, the choice was pretty obvious.
Note that I by no means want to imply that the voice we settled on is perfect. IVONA's voices sound less robotic, which is hardly an advantage if you cannot pronounce "café" correctly.
sounds more like a glitch. It does not pronounce "кафе" and "Где вы завтракаете?" wrong, though. Tatyana has a few other glitches since it is an older voice (try "Я лежу на кровати и читаю").
SpeechPro has its share of glitches, and it is a pity that Yulia is obviously one of their most polished voices (other female voices are notably poorer in quality and sometimes even break timbre).
Yeah, I was quite surprised when Ivona suddenly released it. Tatyana wasn't worth considering.
But then we ran it through our test sentences... and it made quite a lot of mistakes. This is the primary reason why we chose the voice you hear now.
Also, a few non-native speakers said Maxim sounds a bit creepy and ominous :) For me, though, SpeechPro is, maybe, 70% ideal and Maxim 60-65%... hard to choose. If IVONA's analyzer were better or SpeechPro's Yulia sounded at about the same level as Tatyana, our choice would have been clear as day.
I have sometimes heard the TTS place the stress on the first syllable of "она" - but still keeping the pronunciation as if the stress was on the second syllable - so much so that I thought it was saying "Анна". But no matter, I'm loving it, and I thank the team again for all the hard work they put in.
The TTS has been a bit difficult at times. But I am pretty sure Duolingo had some difficulties with Ivona, so unless you wanted to wait until the end of time for an Ivona voice, I think this TTS does things fine. I accept that not all the pronunciations are perfect, and it is not really hindering my learning in any way. So just kind of deal with it.
I think it is Duolingo's policy to always use the slow speed in the introductory lessons.
By the way, people have always complained about how crappy the English voice was. For example, they absolutely swear that "cat" sounds like "cats", and "it" sounds as if it were "its". Also, "dog" and "duck" (and, of course, "dogs" and "ducks") can be easily confused.
Be prepared to be a baaaad listener who does not distinguish "two" and "you". Encountering a foreign set of sounds is hard and even seems unfair at first. Always try to think "Does what I heard make any sense?"
Please tell me if you find anything better. :) I tested iSpeech, GoogleTranslate, IVONA, Nuance, AcapelaGroup and Loquendo, and all of them are rather mediocre. For all I know, you won't be much worse off using RHVoice instead and learning from THAT (this free synth was written by Olga Yakovleva, who is a blind programmer).
Yandex also launched a text-to-speech service recently, which is not much good as of now.
The thing is, you either get correct pronunciation (SpeechPro) or a good "absolutely natural" voice (IVONA, Loquendo, iSpeech, maybe Google). Sometimes a good voice is also bundled with a poor accent (Nuance, IVONA's Tatyana).
One thing to consider is that a decent number of university students in the US and various other non-Russophone countries succeed (more or less) in learning Russian from non-native speakers. Correct pronunciation is great, but if the voice that produces it is using a pulse-train as its voicing waveform, people are gonna get really weirded out. Also, the аканье seems quite weird to me (a non-native Russian linguist). The voice is speaking quite fast, but it sounds like it only has one level of reduction. Jaworski (2010) ("Phonetic and phonological vowel reduction in Russian") found that a/o were longer when immediately before the stressed syllable than in other unstressed positions. It doesn't sound to me like that's a feature of your TTS system, but if it's an in-house thing you might ask about this issue.
Try "молоко". I cannot really comment on the length because the rhythm is off anyway. There is surely room for improvement; I just don't think it is as easy as that (judging by their articles on building a corpus for synthesis, they do use 3 or more different positions for vowels).
No one uses anything as a voicing waveform these days. One way or another, all synths use a large database of recordings of a speaker's voice.
Молоко sounds OK, but in the later lessons there are things (or at least, there were last night and this morning) that seem to have weird interactions between vowel length and talking speed and give off very bizarre percepts. I'll try to listen for some tomorrow.
Of course you don't use a pulse-train; doing things early in the morning is always a bad idea. Nonetheless, I definitely got some square-wave percepts in places, as if the waveforms of concatenated segments didn't have 0 amplitude at either end.
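To make that last point concrete, here is a toy sketch of what a non-zero amplitude at a join does. It is only an illustration: the sine segments, the 16 kHz sample rate, and the helper names (`sine_segment`, `max_jump`, `crossfade`) are all made up for this example, and nothing here is meant to describe how SpeechPro or any other real synth joins its recorded units. Butting two segments together where one ends near +1 and the next starts at -1 leaves a step of almost the full amplitude range, which is heard as a click or buzz; a short linear crossfade is one simple way to soften it.

```python
import math

SAMPLE_RATE = 16000  # assumed sample rate for this toy example


def sine_segment(freq_hz, n_samples, phase=0.0):
    """Return n_samples of a unit-amplitude sine wave as a list of floats."""
    step = 2 * math.pi * freq_hz / SAMPLE_RATE
    return [math.sin(step * i + phase) for i in range(n_samples)]


def max_jump(samples):
    """Largest absolute difference between two neighbouring samples."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


def crossfade(left, right, overlap):
    """Join two segments, linearly blending the last `overlap` samples
    of `left` with the first `overlap` samples of `right`."""
    if overlap <= 0 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must fit inside both segments")
    start = len(left) - overlap
    mixed = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of `right`, rising from ~0 to ~1
        mixed.append((1 - w) * left[start + i] + w * right[i])
    return left[:start] + mixed + right[overlap:]


# Left segment ends close to +1, right segment starts at -1.
left = sine_segment(200, 100)
right = sine_segment(200, 100, phase=-math.pi / 2)

hard = left + right                   # naive concatenation
smooth = crossfade(left, right, 40)   # 40-sample (2.5 ms) crossfade

print("largest step, hard join: ", max_jump(hard))
print("largest step, crossfaded:", max_jump(smooth))
```

The crossfaded version is shorter by the overlap length, which is one reason real systems have to be careful about timing when they smooth joins.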