
https://www.duolingo.com/chacham2

Using human voices instead of the computer voice

I see a lot of complaints about the computer voice. It'd be much nicer if we could have real people speaking the sentences. Having multiple people with varying accents would be even better.

Wouldn't it be nice if people could upload their own pronunciations? The system already blocks renditions with egregious errors, so a basic filter is in place. An option could then be added for people to choose a natural voice in lieu of the computer voice.

Of course, voices would need to be tagged with the gender and accent, and voted on for clarity and correctness. That could be both informative and fun.
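
As a rough illustration of the tagging-and-voting idea, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Recording` class, the `pick_recording` helper, the accent codes, and the simple "upvotes minus downvotes" score are hypothetical and not anything Duolingo actually provides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Recording:
    """A user-uploaded pronunciation of one sentence (hypothetical model)."""
    speaker: str
    gender: str         # tag chosen by the uploader, e.g. "female" or "male"
    accent: str         # tag chosen by the uploader, e.g. "en-GB" or "en-US"
    upvotes: int = 0    # votes saying the clip is clear and correct
    downvotes: int = 0  # votes saying the clip is unclear or wrong

    def score(self) -> int:
        # Assumed scoring rule: plain difference between up and down votes.
        return self.upvotes - self.downvotes


def pick_recording(
    recordings: List[Recording],
    accent: Optional[str] = None,
    gender: Optional[str] = None,
) -> Optional[Recording]:
    """Return the best-scored recording matching the learner's preferences.

    Returns None when nothing matches, in which case the caller would
    fall back to the computer voice.
    """
    matches = [
        r for r in recordings
        if (accent is None or r.accent == accent)
        and (gender is None or r.gender == gender)
    ]
    if not matches:
        return None
    return max(matches, key=lambda r: r.score())
```

A learner who prefers a British accent would call `pick_recording(recordings, accent="en-GB")`; a `None` result means no suitable human recording exists yet, so the computer voice is used.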

3 years ago

8 Comments


https://www.duolingo.com/Xyde
Xyde

And how much data would that take? A human voice doesn't guarantee a clear recording, and I don't think many people out there have professional microphones. It's way more resource-intensive than generating audio from a simple TTS. And the intonation of each word varies so much depending on the sentence that it could bloat the server.

I'd say that developing better TTS would be the way to go. CereProc and Ivona have really great ones, but Duo just isn't using them.

3 years ago

https://www.duolingo.com/_pinkodoug_
_pinkodoug_

Actually, Duolingo has been using the Ivona (Carla) voice for the Italian course since December (they had been A/B testing that voice since September). Hopefully they'll be using more Ivona voices in the future.

Edit to add: I think the more compelling argument against recordings, and one that you mention, is the quality and consistency of TTS output (when it's available) compared with non-professionally recorded, crowd-sourced audio. I don't think that data requirements (storage and bandwidth) are a significant argument against it, though. Bandwidth wouldn't really be affected much since the TTS output is already delivered to the client as an mp3 (example). The additional storage requirements likely wouldn't make a huge difference to what they're already paying AWS.

cGlua29kb3Vn

3 years ago

https://www.duolingo.com/Xyde
Xyde

Good to know that Duolingo has taken more initiative in improving the audio output! I tried the Italian course a year ago but gave up because of the terrible TTS... I guess I'll try it again someday.

I didn't know that the voices were already delivered as mp3... Kind of disappointed about that. I'm currently using Chrome's built-in TTS output for Memrise courses that have no audio files attached, and it works very well.

3 years ago

https://www.duolingo.com/Balaur
Balaur

If this happens, perhaps Duolingo could use a service like RhinoSpike (https://rhinospike.com/). Maybe when recorded sentences get approved by two or more moderators, they can be added to the course, gradually replacing the TTS. It'd be a slow process, but I guess it's better slow than never, right?
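
The "approved by two or more moderators" rule could be sketched roughly as follows. This is only an illustration under stated assumptions: the `SentenceAudio` class, the recording ids, and the `"tts"` fallback marker are all hypothetical, and nothing here reflects how Duolingo or RhinoSpike actually work.

```python
from typing import Dict, Set

# Threshold taken from the suggestion above: two or more moderators.
REQUIRED_APPROVALS = 2


class SentenceAudio:
    """Tracks moderator approvals for recordings of one sentence (hypothetical)."""

    def __init__(self, sentence: str) -> None:
        self.sentence = sentence
        # Maps a recording id to the set of moderators who approved it.
        self.approvals: Dict[str, Set[str]] = {}

    def approve(self, recording_id: str, moderator: str) -> None:
        # A set is used so the same moderator approving twice counts once.
        self.approvals.setdefault(recording_id, set()).add(moderator)

    def audio_source(self) -> str:
        """Return the first sufficiently approved recording id, else "tts"."""
        for recording_id, moderators in self.approvals.items():
            if len(moderators) >= REQUIRED_APPROVALS:
                return recording_id
        # No recording has enough approvals yet: keep using the computer voice.
        return "tts"
```

With this rule each sentence keeps its TTS audio until a recording collects enough distinct approvals, which gives the gradual replacement described above.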

3 years ago

https://www.duolingo.com/Andrew48

I don't know if multiple voices would be a good idea--maybe just a male voice and a female voice--and multiple accents would be really confusing, unless you were able to pick which one you wanted. I would be willing to sit down and record all the English words and sentences on Duolingo, if I had a good microphone to record with. Consistency is important, however, to avoid confusing learners. It would drive me crazy to learn five pronunciations of the same word thanks to the audio being in different dialects. Livemocha had that to some degree, and though it wasn't a huge problem, it was a bit irksome.

3 years ago

https://www.duolingo.com/mwyaren
mwyaren

I remember once taking part in an experiment that required memorising a set of 20 Icelandic words using Memrise. In my version each word was accompanied by a recording; I think these were human voices and not TTS because each voice was different, and honestly, how many TTSes can there be for a language with ~330k speakers? For several days after the experiment I still had those words randomly ringing in my head, each one with a different loudness, pitch, and intonation, male, female, a kind of hellish cacophony. It felt like going insane. So NO, please don't introduce various voices. It's not that good for learning, not at the beginner level (and that's what most of us are doing here).

3 years ago

https://www.duolingo.com/jbailey88
jbailey88

Ivona actually has two really good Icelandic TTS engines: http://www.ivona.com/

3 years ago

https://www.duolingo.com/oscart182

The IVONA voices are excellent and realistic. Home recordings of native speakers would probably not be very good, and professional recording studio voices of real people would be costly and unreasonable.

3 years ago