

Let's talk about the audio

Hey y'all. As I'm sure some of you have seen, there's recently been an update from Team Irish. They talk about the number of reports, and how they are going down. Congratulations to them for getting the kinks worked out. However, they still haven't addressed the main issue. As you can see (here, here, here, here as well as in other threads asking how to correctly pronounce broad/slender distinctions) the audio is very inaccurate. They have assured us that steps are being taken, but they have refused to firmly admit that the audio is incorrect, often attributing it to "dialect", or to state explicitly what is being done, and often attention is directed towards other parts of the course.

Normally, I wouldn't fuss so much about the audio, but, unlike other courses, the Irish course uses a real speaker. This means people are going to be a lot more inclined to trust the audio than they would a TTS system. Yet, the audio is horribly wrong, and is clearly not from a native speaker, or someone with even a competent level of Irish.

I am by no means devaluing the course, or the hard work the contributors, especially AlexinIreland, who contributed over half the course, have put in. In fact, I already recommend it, though I suggest doing it without audio. I also wish to applaud the moderators and contributors on their hard work on the course, and understand that they didn't have anything to do with the audio (apart from confirming it), but, seeing as they're in closer contact with the Duolingo team than any of us are, I would like a concrete answer about what is being done. As has been repeatedly stated, we will take it upon ourselves to find people who are equipped to give proper pronunciations, and will either find volunteers or crowdfund so as not to cost Duolingo any more money, but I feel something needs to be done, as we're short-changing both learners and the language by leaving it like it is.

September 18, 2014



Definitely agree with the point about TTS and trust, that's the worst part of it. When I hear the robot Italian lady's voice, I take it as a rough guideline, not as the real, 100% correct pronunciation - like, "Yeah, thanks for the suggestion, Francesca... -cough-WRONG-cough-"
But when you know a real, live person was probably paid to record the Irish voice... then who am I to argue with it? Surely a real human knows what they're doing.
But even I can hear that it's strange and not the way the people on the radio speak. :/


Totally agreed. In fact, if you look at the Irish forums, you'll see that the recorded voice is a major selling point of this course. People are super excited about how "clear and accurate" it is. They put a greater amount of trust in it, because it is a real recording of a person. And clear it is, but if it isn't accurate, then it's misleading people. Not because the Irish course is bad, quite the opposite: because everything else about it is so good.


We did not select the voice artist ourselves, but we have been assured that she is a native speaker. Duolingo have been collating error reports for the audio, and a batch of re-recorded sentences has gone live in the past few days. If you come across sentences which sound off, keep hitting the "report" button to flag the issue.

This whole area of recorded audio is new for Duolingo, and the Incubator interface wasn't directly built to accommodate it. Audio is instead handled directly by Duolingo tech staff, as all that has usually been involved was implementing text-to-speech software, with course contributors only rarely getting involved to flag individual sentences for removal where the automated speech synthesis happened to be confusing. The process will necessarily be more complex for individually recorded audio strings.


Whether she calls herself a native speaker or not isn't really relevant (we've discussed heritage and continuity speakers in the original thread https://www.duolingo.com/comment/4305589). The point is that she's approximating Irish sounds with English ones and it's unacceptable for learners. Having her re-record sentences isn't addressing the problem at all because the problem isn't her mispronouncing something here and there, but rather a consistent lack of understanding of natural Irish phonology. For those interested see here for more discussion -> http://bit.ly/1rCPKOI


I'd be more than happy to donate for new audio, or whatever needs to happen. I hope Duolingo realizes there's a lot of us willing to crowdsource for it.


Me too, but given that we've had no response from them despite talking about this issue for over a month, I don't feel very hopeful. I even have native speakers I could try to talk into doing it, though I would also donate to a fund for good audio.


I just spent last weekend at a three-day Irish immersion course at the United Irish Cultural Center in San Francisco. There were four native speakers there, representing the different dialects. Not one of them pronounces words the way the woman's voice in Duolingo's audio does. It came as quite a shock to me, and I'm signing up for their weekend course just to hear how it's really pronounced. One of the more accomplished students, who's been studying the language for more than ten years, including in Ireland, said the Duolingo audio was laughable. But then another long-timer said it was good enough. And that's fine with me, as I'm quite fond of this program! I'll hunt down the forum discussion about where she's from. It'd be helpful, I think, if this were made clear from the get-go somehow, tho'.


Here's a quick sketch of an interim solution. (warning: stream of thought software architecting follows.)

User part
Using Greasemonkey or equivalent, it should be possible to create a script which a Duolingo user could install in their browser. The script would activate during lessons. Now, each sentence/exercise has a unique identifier (hash). The script could take this ID and send it to a server which matches the ID to a sentence. The server then replies with one or more URLs pointing to recordings of the sentence (using an API such as SoundCloud's, or possibly direct links to audio files). The script would then present the audio clips as an alternative to the official one already present in the exercise.
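
To make the lookup step a bit more concrete, here is a rough TypeScript sketch of the ID-to-recordings matching. Everything in it is an assumption for illustration: the names `clipIndex`, `lookupClips` and `toAlternatives`, the sample ID `abc123` and the example.org URL are all made up, and an in-memory map stands in for the real server and database. It doesn't touch the Duolingo page or any real API.

```typescript
// Hypothetical response shape; nothing here is Duolingo's real API.
interface AudioLookupResponse {
  sentenceId: string;
  clipUrls: string[];
}

// Stand-in for the server's database: sentence ID -> alternative recordings.
// The ID and URL below are invented sample data.
const clipIndex = new Map<string, string[]>([
  ["abc123", ["https://example.org/audio/abc123-take1.mp3"]],
]);

// Server side: match the ID and reply with zero or more clip URLs.
function lookupClips(sentenceId: string): AudioLookupResponse {
  return { sentenceId, clipUrls: clipIndex.get(sentenceId) ?? [] };
}

// Script side: turn the reply into labelled alternatives that could be
// shown next to the official audio in the exercise.
function toAlternatives(
  res: AudioLookupResponse
): { label: string; url: string }[] {
  return res.clipUrls.map((url, i) => ({
    label: `Alternative recording ${i + 1}`,
    url,
  }));
}
```

In a real userscript, `lookupClips` would be an HTTP request to the server rather than a local call, and an unknown ID would simply leave the exercise with only its official audio.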

Contributor part
Of course, this only works if there are actual audio clips available for our use. Hence, the script also needs to have functionality for native speakers to contribute audio. The script would add a button to each exercise, saying "contribute audio". Clicking on the button pops up a form dialog where you can input the URL pointing to your audio recording. Submitting the form would add the mapping from the current sentence ID to URL to the database, possibly flagging it for review. Only reviewed audio would be heard by the users.
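
The contribution flow could be sketched like this, again in TypeScript and again with made-up names (`submitClip`, `approve`, `clipsForLearners`) and an in-memory array standing in for the database. The URL check is only a crude assumption about what the form might validate. The point it illustrates is that a submitted clip starts out flagged for review and only reaches learners once someone approves it.

```typescript
type ReviewStatus = "pending" | "approved";

// One row of the hypothetical sentence-ID -> URL mapping.
interface Contribution {
  sentenceId: string;
  url: string;
  status: ReviewStatus;
}

// Stand-in for the server's database table.
const contributions: Contribution[] = [];

// What the "contribute audio" form would submit: the current sentence ID
// plus the URL of the recording. New entries are flagged for review.
function submitClip(sentenceId: string, url: string): Contribution {
  if (!/^https?:\/\/\S+$/.test(url)) {
    throw new Error(`Not an http(s) URL: ${url}`);
  }
  const entry: Contribution = { sentenceId, url, status: "pending" };
  contributions.push(entry);
  return entry;
}

// A reviewer marks a clip as fit for learners.
function approve(entry: Contribution): void {
  entry.status = "approved";
}

// Only reviewed audio is ever returned to users.
function clipsForLearners(sentenceId: string): string[] {
  return contributions
    .filter((c) => c.sentenceId === sentenceId && c.status === "approved")
    .map((c) => c.url);
}
```

Who gets to call `approve` is exactly the quality-control question; the sketch only shows where that decision would plug in.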

This still leaves the question of quality control open. Who should review the audio? Also, would there really be any contributors? Would the recordings be any good?
