Crowd Source Speaking

Obviously, DuoLingo is one of the best crowd-sourcing language learning programs out there. Crowd sourcing is a big part of the interface; it's how we translate the internet.

However, I've heard some complaints about the robot voice used in some lessons. This leads me to wonder: Why not crowd-source the voices?

DuoLingo could have certain expressions for lessons that people would have to say (in their native language, of course). Then, they'd rate another clip. If a sound clip gets an average rating of 4 stars after, say, 15 ratings, it would be used in the place of the robot voice.
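To make the rule concrete, here is a rough sketch of how the promotion check might look. Everything in it is hypothetical: the `Clip` class, the 1 to 5 star scale, and reading "an average rating of 4 stars" as "at least 4.0" are my assumptions, not anything DuoLingo actually does.

```python
from dataclasses import dataclass, field
from typing import List

MIN_RATINGS = 15   # "say, 15 ratings" from the proposal above
MIN_AVERAGE = 4.0  # assumption: "average of 4 stars" means 4.0 or higher


@dataclass
class Clip:
    """A user-recorded sound clip for one lesson expression (hypothetical)."""
    sentence: str
    ratings: List[int] = field(default_factory=list)

    def rate(self, stars: int) -> None:
        # Assumption: ratings are whole stars on a 1-5 scale.
        if not 1 <= stars <= 5:
            raise ValueError("stars must be between 1 and 5")
        self.ratings.append(stars)

    def average(self) -> float:
        # An unrated clip averages 0.0 so it can never be promoted.
        return sum(self.ratings) / len(self.ratings) if self.ratings else 0.0

    def replaces_robot(self) -> bool:
        # Promote only once enough people have rated it AND the average is high.
        return len(self.ratings) >= MIN_RATINGS and self.average() >= MIN_AVERAGE
```

Requiring a minimum number of ratings keeps a clip with a single 5-star vote from replacing the robot voice right away.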

They would keep a "revert to robot" option, just in case, and a listener could report a problem with the clip. Through listening to many dialects, listeners would be more prepared for real life situations. What do you think?

February 5, 2013


The potential issues people mention here (e.g. accents/dialects) not only seem relatively easy to resolve, but handling them well could also be INCREDIBLY USEFUL to language learners.

When a user opts to record a bit of audio, just have them set a few variables that let us know what region of what country their accent comes from, fluency (beginner, XX years, professional (instructor, tutor, etc.), native), microphone type, gender, approximate age, etc. Naturally, this information would be stored with their profile so they don't have to keep entering it.

Regions would actually be an excellent FEATURE, because then learners could choose to listen to audio with the accent of the region they're targeting. If I'm planning to spend a month in Rome, I could set the playback preference to that region. I'm moving to Milan? I could set playback to that region. And of course, those of you who love the computer voice could still set your preference to "roboregion". You could even go so far as to set your playback to a gender within the region or to a specific user (if they recorded audio for that sentence; otherwise default to the top voted). So I could set my default playback preference to "Italiano, Milan", "Italiano, Milan, Female", or "Giuseppe from Milan" - or more broadly "native speaker (best of all native speakers, regardless of region)", "professional", etc.
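The preference matching described above could be sketched roughly like this. It is only an illustration under my own assumptions: `Sample`, `Preference`, and `pick` are made-up names, and a result of `None` stands in for falling back to the "roboregion" voice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    """One recording of a given sentence (hypothetical structure)."""
    speaker: str
    region: str
    gender: str
    votes: int


@dataclass
class Preference:
    """A learner's playback preference; None means "don't care"."""
    region: Optional[str] = None
    gender: Optional[str] = None
    speaker: Optional[str] = None


def pick(samples: List[Sample], pref: Preference) -> Optional[Sample]:
    # Keep only samples that match the requested region and gender, if set.
    pool = [
        s for s in samples
        if (pref.region is None or s.region == pref.region)
        and (pref.gender is None or s.gender == pref.gender)
    ]
    # A specific speaker wins if they actually recorded this sentence.
    if pref.speaker is not None:
        for s in pool:
            if s.speaker == pref.speaker:
                return s
    # Otherwise default to the top-voted sample; None means "use the robot".
    return max(pool, key=lambda s: s.votes, default=None)
```

With a preference like `Preference(region="Milan", speaker="Giuseppe")`, this returns Giuseppe's clip when he has one and the top-voted Milan clip when he doesn't.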

This information (region/fluency/etc) would be readily displayed next to the audio sample when we vote them up/down, and only samples from our chosen region would be displayed. The most highly voted clip in that region would be the default clip you'd hear when doing the lesson. Delete any audio with X number of down votes (to address the space issue).
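As a rough sketch of the voting side, assuming a made-up `Recording` structure: the threshold of 10 is just a placeholder for "X", and ranking by up votes minus down votes is my own reading of "most highly voted".

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_DOWN_VOTES = 10  # placeholder for "X"; the real number would need tuning


@dataclass
class Recording:
    """A user recording plus the metadata shown next to it (hypothetical)."""
    speaker: str
    region: str
    fluency: str
    up_votes: int = 0
    down_votes: int = 0


def prune(recordings: List[Recording]) -> List[Recording]:
    # Drop any recording that has reached the down-vote limit (saves space).
    return [r for r in recordings if r.down_votes < MAX_DOWN_VOTES]


def default_clip(recordings: List[Recording], region: str) -> Optional[Recording]:
    # Only surviving recordings from the learner's chosen region are candidates.
    candidates = [r for r in prune(recordings) if r.region == region]
    if not candidates:
        return None  # nothing recorded for this region: fall back to the robot
    # Assumption: "most highly voted" means highest net score.
    return max(candidates, key=lambda r: r.up_votes - r.down_votes)
```

Pruning before picking means a heavily down-voted clip can never become the default, no matter how many up votes it also collected.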

Further, native and fluent speakers could even be incentivised to record audio in some way, whether that's in the form of monetary incentives like points towards gift cards, or something as simple as special badges and such. Give additional points to users whose audio is voted up (to encourage quality).

Bottom line, if implemented along the lines outlined above, this could be an AMAZING addition, giving Duolingo learners a serious edge, and if I could vote it to the top I would. This would be my number one request.

February 7, 2013

Many of the languages here have myriad accents and pronunciations based on geography. This is most true of Spanish, as the language differs across continents and countries. Portuguese and French have it almost as bad: Canadian French, for example, is quite different from both European French and African French, and Peninsular Portuguese differs from the Portuguese spoken in Brazil.

It would be the equivalent problem of someone learning English and learning from listening to voice examples of speakers from the US, from England, from Australia, from South Africa, from Scotland, from Ireland, from Jamaica and from Liberia. They all would be correct, but the variances could trip people up.

February 5, 2013

I simply think that trying to get people conversationally fluent is way, way beyond the scope of Duo's mission and capability. I'm not keen on learners evaluating one another's written translations (the "blind leading the blind"), and most of us are just not capable of competently evaluating others' pronunciation. And, as lago points out, which dialect is "correct"? I could listen to someone's speech and say, "Well, that would be OK in Argentina, but not Mexico." I have enough trouble just in the US dealing with native English speakers from Boston, Texas, Alabama, New York, and so on!

February 5, 2013

I agree that it would be difficult due to the great number of accents and dialects each language has. Only a few people actually speak "standard Spanish", whatever that is. Even in Germany, which is really small, native speakers from the North have difficulties understanding native speakers from the South, so how should a non-native speaker understand either?

February 5, 2013

Isn't that the point? By hearing different pronunciations, the learner's ear would become more accustomed to real-life language. There could always be a 'revert to robot' option in case the learner really can't get it, right next to the option to play the voice more slowly.

February 6, 2013

I agree with what you're saying. I've been learning Spanish, and although I can understand speakers from Spain, I still have a hard time understanding those from some South American countries. If I had had more variety in the people I spoke with, I'm sure I'd have an easier time understanding multiple speakers.

February 6, 2013

I would always prefer hearing a native speaker of ANY dialect to a computer...

February 7, 2013

That's a great idea! It may not be practical, though.

It looks to me as though the exercises in the lessons are generated on the fly based on a student's previous errors and an algorithm that cycles vocabulary for each student based on their individual learning patterns.

So, the audio has to be generated on the fly as well -- hence the robot. The sentence I record might never be offered to you and vice versa. There are implications for storage and system response time.

February 5, 2013

So far I have not had a problem with the speech from the program, other than subtle differences such as "Er" versus "Ihr". But this would certainly add to the features and overall value of the program. It will likely be easier to keep the robot voice as the default, as a standard language to get used to, and then also have the option for other pronunciations, when and where available. I guess this will also need an indication of which region the speaker comes from.

February 6, 2013

I think the discussion about widely varying accents is spot on. I would find it chaotic to hear phrases spoken in wildly different accents. My ear will attune to the various accents once I have enough Spanish to talk with native speakers. As a Texan who's learned to understand English in Boston, Maine, Cockney English, and Belfast (unbelievably thick accent), I think I'll learn to handle Mexican, Argentinian, Castilian, and Cuban (also very thick accent).

I find the robot voice pretty understandable. The few times it's not, I've always understood the slower version. It may not be true now, but in the U.S., for many years, national TV and radio news anchors were chosen (or trained) to have a "neutral Midwestern" accent, which was deemed most understandable by most Americans. Seems to me that the Spanish voice robot was similarly given whatever accent Spanish speakers consider the most neutral.

February 7, 2013

So what is the status of this?

April 22, 2014
