Suggestion: Crowdsourcing audio
Kind of an out-there suggestion.
One of the main problems I see stated over and over again is the quality of the TTS. I was thinking about ways to get around this.
I know Forvo and RhinoSpike do something similar already - the problem is just that they don't have the sentences that Duolingo does. So either way it would effectively mean starting from scratch. But, given a fixed corpus of sentences, why don't you just crowdsource the audio pronunciations? Ask native speakers to record each sentence, give points for upvotes, and fall back to TTS where no recording exists?
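To make the idea concrete, here is a rough sketch of the "upvotes, else TTS" selection in Python. Everything in it is made up for illustration: the `Recording` class, the `pick_audio` function, and the `min_score` threshold are my own assumptions, not anything Duolingo actually has.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Recording:
    """One crowdsourced recording of a sentence (hypothetical structure)."""
    speaker: str
    audio_url: str
    upvotes: int = 0
    downvotes: int = 0

    @property
    def score(self) -> int:
        # Net votes; a real system might weight or normalise these.
        return self.upvotes - self.downvotes


def pick_audio(recordings: List[Recording], tts_url: str, min_score: int = 1) -> str:
    """Return the best-voted recording, or the TTS audio if none qualifies.

    min_score is an assumed threshold: recordings below it are ignored.
    """
    eligible = [r for r in recordings if r.score >= min_score]
    if not eligible:
        return tts_url  # default back to TTS
    return max(eligible, key=lambda r: r.score).audio_url
```

So a sentence with no recordings, or only downvoted ones, would still play the TTS audio, and nothing gets worse than it is today.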
I agree that the TTS isn't the best solution and that we should eventually have real voices. But I see problems with the idea of crowdsourcing the task. It works well on Forvo and similar sites, but users there only look up words or phrases once in a while. On Duo you hear the audio for every single sentence you come across. Crowdsourced audio would mean that you hear a different voice with a different mic, different audio settings, different noise levels, different accents, different speed (...) for every sentence you work with. I doubt this would be an aesthetically pleasing experience. (Yes, aesthetics matter!) I'd very much prefer them to hire professional speakers to record the audio. (Ever listened to a crowdsourced audiobook with non-professional speakers involved? It's terrible!)
It all depends on how the recordings are filtered. If there are criteria to meet concerning audio quality, pace, etc., then first, people will think twice about making their own recording (avoiding quick vandalism like in the current "immersion" tab), and second, the overall quality will be greatly improved. Honestly, I don't think we need professionals when we have so many native speakers available here, but that's just me. As long as the end product is checked and double-checked, we should be fine.
Compliments on this novel idea. Regional dialects would be a big problem though. In the UK as well as in Italy (whose language I am learning) you only need to travel less than 100 miles for the dialect to change beyond recognition.
It's not a problem. Why not keep all correct versions, regardless of accent or dialect? If it's accents, nothing needs to be done: people have to get used to all kinds of accents if they want to practice their listening skills for real. If it's dialects, though, a simple label next to the audio would be great, because some words and expressions are completely different and really can't be mixed with the original language (e.g. French and Quebec French).
That's not a problem, it's a feature! If people really want to get a grasp of the language that is something that must be tackled eventually and it is better sooner than later. I also study Irish (Gaeilge) and that language has been highly fractured linguistically. Learners from the beginning tend to pick a broad region and some, like me, even a more specific sub-dialect, to start their journey. But all learners of Irish just have to come to terms with the dialectal complexity. So long as the files have metadata describing where the person is from (city/state or province) and perhaps identifying their regional dialect, I see no reason why it wouldn't work very well for some students.
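A minimal sketch of what that metadata could look like, in Python. The `RecordingMeta` fields and the `recordings_for` helper are hypothetical names of my own, and the fallback rule (play any available recording when the preferred dialect has none) is just one possible choice.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RecordingMeta:
    """Metadata attached to one audio file (hypothetical fields)."""
    sentence_id: int
    audio_url: str
    region: str   # where the speaker is from, e.g. city/state or province
    dialect: str  # regional dialect label chosen by the speaker


def recordings_for(recordings: List[RecordingMeta], sentence_id: int,
                   preferred_dialect: str) -> List[RecordingMeta]:
    """Return recordings of a sentence in the learner's chosen dialect.

    Falls back to every recording of that sentence if the preferred
    dialect has none, so the learner still hears a native speaker.
    """
    matches = [r for r in recordings if r.sentence_id == sentence_id]
    preferred = [r for r in matches if r.dialect == preferred_dialect]
    return preferred or matches
```

A learner who picked Munster Irish, for example, would hear Munster recordings where they exist and other dialects (with their label shown) where they don't.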
There would need to be a label specifying which dialect/accent is being used in the sentence. But I believe dialects refer more to the words or sentence structure, so if the speakers read the sentences to the letter there wouldn't be dialects at all, just different accents, and maybe that can be acceptable. But maybe that could be a different section in the advanced branches of the tree. Otherwise, if accents are used from the very beginning, then they would have to provide both a neutral pronunciation and the accented one for newcomers.
I see a problem with selecting good recordings by "upvotes": most of the people voting won't be native speakers, and won't be capable of evaluating whether the pronunciation is correct. We'll get recordings chosen according to how easy they are to understand rather than how correct they are.
For all the complaints about the TTS, I find it easier to understand than real native speakers, at least in Italian and German. (Italians, unfortunately, do not come with a "repeat this more slowly" button.)