Has anyone else found themselves with a new TTS voice? I've had it for about a week now, and to be honest, it sounds pretty awful, kind of like a drowning woman. Anyone else have/mind it, or am I alone here?
I've heard they're testing a female voice. It's by the same company as the male voice, but I agree it's not quite the same quality.
Agreed, she usually sounds either tired or croaky. I'm not sure if it's permanent or not - could be an A/B test - but either way, I'd like to be able to choose between them at the very least.
She does come across a little bit unmotivated sometimes, but I like the mix I hear right now. I feel a better training effect when the words are spoken by different voices. She also does a way better job than the guy of making je/jij and ze/zij sound different.
To my ears she doesn't even sound like a native Dutch speaker, but like someone with a heavy accent, maybe Russian. Sure, the guy has some weird pronunciations going on sometimes, but she is kinda strange to listen to. Plus, as others have said, she has zero or less energy. As someone who works with voice recordings every day, I can say that this is not an acceptable result.
The male voice is 100% not synthetic, but the result of stringing together words recorded by a human. The female voice has no artifacts typical of synthetic voices, so it was most likely produced the same way.
Technology is evolving faster than you think and so you are 100% wrong. The voices are generated by A.I. and machine learning. This blog post describes it in more detail: https://aws.amazon.com/de/blogs/machine-learning/powering-language-learning-on-duolingo-with-amazon-polly/
True, they are generated. But they are generated from base recordings with real actors. The technology behind Amazon Polly was developed by the company Ivona, which Amazon later bought. And I quote from their web site: "The voices inherit the persona of the original voice talents". Listening to the voices in headphones makes it clear that they are indeed based on real humans, since there's a noticeable variation in proximity, room echo and clarity in the sound files. Anyway, regardless of how the voice was produced, it still has a strange accent and dull delivery, which is the whole point of this thread.