Suggestion: Ability to Apply to Be an Audio Contributor
I think it would be cool if native speakers could apply on Duolingo to read sentences, to replace the robotic voice we have now.
This would be excellent if implemented with some of the features of forvo.com, such as the accent map. It seems like a lot to ask users to pronounce thousands of sentences, but other sites have examples of users who've recorded clips in the tens of thousands.
It could work via a user-designated priority system, in which you order a few of these pronunciation volunteers by preference, so that when your preferred one is present, their pronunciation is used - and when they're not, it goes to the next preferred volunteer in your list.
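As a minimal sketch of that fallback logic (all names here are hypothetical, since Duolingo exposes no such API): walk the user's ranked list of preferred volunteers and play the first one who has recorded the sentence, falling back to TTS otherwise.

```python
def pick_recording(preferred_speakers, recordings_by_speaker):
    """Return the recording of the highest-ranked speaker who has
    recorded this sentence, or None to fall back to the TTS voice."""
    for speaker in preferred_speakers:
        recording = recordings_by_speaker.get(speaker)
        if recording is not None:
            return recording
    return None  # none of the preferred volunteers recorded it: use the synthesizer

# Example: the user's second choice has recorded this sentence
available = {"maria": "maria_s123.ogg", "kenji": "kenji_s123.ogg"}
print(pick_recording(["alex", "maria", "kenji"], available))  # maria_s123.ogg
```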
You could implement a reputation system, in which users garner positive votes to show that people think well of their pronunciations. I think it'd be necessary to have no downvoting in a system like this, however, as I could inevitably see users downvoting pronunciations that don't match their own regional standard.
With a positive-only voting system, people with clear pronunciation of the most notable accents would likely receive the highest reputation, while those of lesser-known accents will not reach the same heights - which is okay, because learners tend to aim for one of the big standards, anyway. I wouldn't expect my Ayrshire Scottish English to be so desirable next to Received Pronunciation or General American pronunciations! :P But then, the fringe accents are there for those who do desire them! Some people want to learn a language to live in a country which isn't home to one of the big accents, after all, so that could be useful.
Space requirements for storing all of this, however, are inevitably significant, to put it mildly.
With librivox, you have to apply to be an audio contributor by sending in a sample recording so they can make sure you speak clearly and have decent audio recording equipment. Duolingo could implement a similar thing to make sure that the person speaks clearly enough (to ensure they're actually an improvement over the robot) and to make sure that their recording is clear and without background noise etc.
Unfortunately, Luis thinks user retention is more important than quality.
"Hi. We've actually tried a real human voice for the first 10 lessons (for English and Spanish, with professional voices), and people liked it less! (They returned to the site less.) Based on this we decided to work on improving other aspects of the site instead."
… and it's cheaper. (Which I suspect also plays a role :) ) At least with respect to the tremendous number of errors in the German course, they can hardly argue that users like it that way.
I actually don't see any reason that it would have to be especially costly to use live voices. That could be crowdsourced in a manner similar to the immersion translations. Taking Spanish, for example, we know that some native English speakers "work the tree backwards" after completing the English to Spanish tree. DL could have anyone who is a native speaker or who has qualified by completing the opposite tree or taking a test then just record sentences as they are encountered in the backwards tree. By adding a voting feature for the audio playback like we have for comments, a selection algorithm could cycle through the available recordings (and substitute the TTS when none are available) and let the best rated recordings percolate to the top as well as providing a greater variety of voices to listen to. Something of this nature would allow them to - over time - capture audio of expert and native speakers without needing to pay for voice talent to record it.
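A rough sketch of such a selection algorithm (everything here is an assumption, not anything Duolingo has built): usually serve the best-voted recording, occasionally serve a random one so new recordings can gather votes, and substitute the TTS clip when no human recording exists.

```python
import random

def choose_audio(recordings, tts_clip, explore_rate=0.1):
    """Pick a clip for playback.

    recordings: list of dicts like {"clip": ..., "upvotes": int}
    explore_rate: fraction of plays given to a random recording,
    so newly submitted recordings get a chance to collect votes.
    """
    if not recordings:
        return tts_clip  # no human recording yet: fall back to TTS
    if random.random() < explore_rate:
        return random.choice(recordings)["clip"]
    # otherwise let the best-rated recording percolate to the top
    return max(recordings, key=lambda r: r["upvotes"])["clip"]
```

With `explore_rate=0` this is a pure "play the top-voted recording" policy; raising it trades consistency for exposure of new contributors.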
Crowdsourcing the audio to random, non-professional voices is probably the only way to get an even worse result than the current situation… (And please don't quote forvo.com or similar sites. It's a big difference whether you just build an audio database for users to look up individual pronunciations or whether your users have to hear 50 sentences in a row with different accents, different mic settings, etc. This would most probably drive users crazy. If you use natural voices, they have to be professional speakers.)
Yeah, I have read it already. It is hard to argue with: higher retention suggests that users don't always see real voices as a hallmark of quality.
I think there are too many variables. A synthesizer has a huge advantage in clarity and stability. Few professional voices can boast a mild, gentle, "average" voice that a person cannot get tired of and that can sound neutral and unchanged for thousands of sentences. And that is exactly what is required if sentences are going to be played repeatedly, and in random order.
It would have been a more honest test if they had ensured the professional voice was close to a TTS in that regard: neutral, pleasing, and not grating when you hear it a few hundred times, sometimes repeatedly. I suspect the results would have been less surprising then: people would likely have preferred a real voice slightly more than a sampled one.
Or maybe people just associate overly professional voices with the dozens of other language courses that are not that exciting, and subconsciously feel bored and uneasy :) A valid scenario, though, again, only the statistics truly show how it all affects thousands of users on average. I'd argue that if you keep studying, your progress is going to be better even with TTS instead of a real voice. If you hate the real voice and give up, your progress is likely to be none at all, at least on Duo :).
I suspect the outcome would also be different if they'd performed their test on the last 10 units instead of the first ones (i.e. with a body of more committed learners and fewer casual users).
Well, in the last 10 lessons you are likely to notice that the voice has suddenly changed ^_^. For the first 10 lessons you get the full experience without any suspicion whatsoever that something could have been different.
Anyway, that's not the priority for now. Recording audio makes sense only once the course is set in stone, and, unfortunately, that didn't work out. For the En→Ru course it became painfully obvious that some sentences are way too ambiguous, atypical, or hard to translate naturally to be effective in the early stages of teaching a language. Which means we are going to add our own sentences as soon as we are able to. For some words the original Duolingo sentences just don't have enough good examples to adequately show how the word is used.
I didn't want to suggest this to be the way to actually perform the experiment. I rather wanted to point out that their data is heavily distorted towards the needs (or better: liking) of casual users. They are constantly tailoring the site towards the satisfaction of this user group. IMHO this is the wrong way to go.
What I think might be a problem with such a thing is the variance of accents and ways of speaking, which could confuse the listener with inconsistency - though I have to say it would train the person to get used to a variety of ways of speaking. I'm also guessing that bandwidth would take a hit as well.
That's more of an opportunity. If they actually did this, contributors would be required to indicate where they're from, and then users could choose to hear only native speakers from a single area, or native speakers from all over, depending on their preference.
Would also be pretty cool if they ever bring back vocab and then allow you to click on any word and have it spoken by any number of different people with different accents so you could see regional differences.
Bandwidth likely wouldn't be a big issue either: in general they'd still send just one voice recording per sentence. Managing it and approving all the different voices for inclusion, however, would be a very big job.
Good idea. They would need to make sure that it's a standard accent all across and not a dialect that's not considered the standard pronunciation, though.
What would be great would be if anyone could add an audio (even if it was just in the comments section) so that the listener could hear different accents/voices.
Being able to hear accents other than the standard one could be useful as well - say, if you're preparing for a trip to Australia.
It should probably be some information one has to provide along with the audio.
Mind, though, that my estimate is that each course has 5000-6000 different sentences as a bare minimum. To adequately represent different accents, a user should be able to hear more or less the same sentences all over the course read by the same people. Which, I guess, would require each audio contributor to pronounce at least a thousand or two sentences from all parts of the course.
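To put rough numbers on the storage concern raised earlier (only the 5,000-6,000 sentence estimate comes from this thread; the contributor count, clip length, and bitrate below are my own assumptions):

```python
sentences_per_course = 6000       # upper end of the estimate in this thread
contributors_per_course = 10      # assumption: ten accents/voices per course
seconds_per_clip = 5              # assumption: average sentence length
kb_per_second = 4                 # assumption: ~32 kbps speech-quality audio

clips = sentences_per_course * contributors_per_course
total_mb = clips * seconds_per_clip * kb_per_second / 1024
print(f"{clips} clips, roughly {total_mb:.0f} MB per course")
```

Even under these modest assumptions it comes to over a gigabyte per course, before redundant recordings of the same sentence are counted - so "significant, to put it mildly" seems fair.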
A few thousand sentences isn't really that unreasonable. Getting someone to check and approve the recordings would be a huge pain.
They don't need someone to do that, the community does that already by reporting audio which sounds incorrect.
Here's 3 lingots for this great idea. I've been thinking about it for when I start the Albanian course.
It doesn't exist at the moment. I have applied for it and I am hoping to get a response as soon as possible.
I like the idea of providing learners with a better human touch along with different styles of pronunciations in their new language. However, I do not fully understand the feasibility and scalability.
I'd be happy to provide English sentences. I hear my midwest accent is the "norm" in news broadcasting.
Actually, I've had this complaint for a long time, but personally I would opt for an alternative TTS instead of the currently implemented one. I've already bought a few TTS voices for Android and they are all of great quality (in terms of clarity, naturalness, rendering, pitch, etc.).
It would be great if Duolingo would let us choose our own TTS someday... right now I'm staying away from the Italian course because I couldn't handle the synthesizer's pronunciation :(
It is a good idea. I am French and I finished the 'English from French' tree. I am trying to finish the 'French from English' tree and there are some mistakes in the French pronunciation. I am not able to tell if it is the same in English. I would also be interested in the pronunciation of a language by native speakers from different countries. But maybe it would be difficult for Duolingo to implement?
There are some mistakes in English TTS. First, some words are buggy (color, wrong, finished).
Second, in English some words can have two pronunciations depending on meaning or the noun/verb distinction: Polish language - polish the floor; I like to read - I have read a book; live turtle - I live in Leipzig; That's my favourite record - We need to record everything; We need this contract - Can you contract this?
In these cases the TTS seems to simply always choose one variant regardless of meaning and context.
I would distinguish between the stress of a word (noun/verb) and its pronunciation. In English especially, stressing a word correctly is very important. When I speak it is very difficult to remember whether to stress a word on the 1st or 2nd syllable, depending on whether it is a verb or a noun, or on the number of syllables... A visual aid would be welcome in this case.
Fortunately, this is a basic course, so you have VERY few words where the placement of stress can vary. But it would be very nice of them to fix it.
"read" has different vowels depending on whether it is an infinitive/present (pronounced [ri:d]) or past/past participle (pronounced [red]).
Exactly. I completely agree with you. I hate the robot woman we have at the moment.