

Natural Voice Bank

Duolingo already utilizes crowdsourcing for the translations, which is pretty cool. How about utilizing the same method for a natural voice bank?

The infrastructure is already in place with the speaking segment, which already records and analyzes your voice. Why not add another exercise type that asks you to translate a sentence and speak it in your native language? For example:

Translate and say in English: "Mon chat veut un cheeseburger."

You would then press record and say: "My cat wants a cheeseburger."

Once the analysis accepts the response, the recording would be added to the voice bank and used in the English course.

To resolve quality issues, ratings and reporting could also be applied here, which would then affect the level of utilization.

I think this would be the ultimate feature for Duolingo.

January 22, 2013



There are some potential problems, like the big differences between people's microphone setups and the fact that many use Duolingo via a secondary language and may have heavy accents. But I agree that it would be good to hear more voices than the computer-generated one.


You're right, those are valid issues. Sheer numbers could mitigate them, provided that more participation through crowdsourcing actually yields better samples. It would be an interesting experiment.


But how do you ensure the accuracy of everyone's pronunciation?


Off the top of my head, a three-level validation process could address this.

The first level is the software analysis, which already exists for the speech segment.

The second and third levels would then use the crowdsourcing community to rate and report quality. It makes sense for the second-level validation to go through native speakers of the recorded language first, before the sample reaches the learners of that language. For example:

At the second level, a natural English sample could be played to a native English speaker without accompanying text, and that user would need to translate it into the target language, French. If the user finds any quality issues, they would report them.

At the third and final level, the target-language users would rate the quality of the voice to determine its ranking and how often it is used.
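
To make the three levels a bit more concrete, here is a rough Python sketch of how they might fit together. Everything in it is an assumption for illustration: the Recording fields, the thresholds, and the function names are made up, and none of this reflects how Duolingo actually works. Levels one and two act as gates, and the level-three ratings become a weight that controls how often a sample is played.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Recording:
    """A hypothetical user-submitted voice sample."""
    sentence: str
    software_score: float   # level 1: speech-analysis score, 0.0 to 1.0
    reports: int = 0        # level 2: quality reports from native speakers
    listens: int = 0        # level 2: native speakers who heard the sample
    ratings: list = field(default_factory=list)  # level 3: learner ratings, 1 to 5


# Made-up thresholds, chosen only for illustration.
MIN_SOFTWARE_SCORE = 0.8
MAX_REPORT_RATE = 0.2


def passes_level_one(rec):
    """Level 1: the existing software analysis accepts the recording."""
    return rec.software_score >= MIN_SOFTWARE_SCORE


def passes_level_two(rec):
    """Level 2: native speakers heard it and rarely reported problems."""
    if rec.listens == 0:
        return False  # nobody has checked it yet
    return rec.reports / rec.listens <= MAX_REPORT_RATE


def usage_weight(rec):
    """Level 3: the average learner rating sets how often the sample is used."""
    if not (passes_level_one(rec) and passes_level_two(rec)):
        return 0.0
    if not rec.ratings:
        return 1.0  # low starting weight until learners have rated it
    return sum(rec.ratings) / len(rec.ratings)


def pick_recording(recordings, rng=random):
    """Pick a sample at random, favoring higher-rated recordings."""
    weights = [usage_weight(r) for r in recordings]
    if sum(weights) == 0:
        return None  # nothing has passed validation yet
    return rng.choices(recordings, weights=weights, k=1)[0]
```

With this setup, a recording that fails the software check or gets reported too often is never played, while a well-rated one is played more often than a poorly rated one.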

It would be difficult to achieve 100% accuracy, but this also applies to the real world. Everybody's pronunciation is different. Having a broad range of natural voices would be ideal for learning, even if there are some mistakes. I would definitely prefer this over the auto-generated voices.
