Sound bites and intonation.

Okay, I'm quite new to Duolingo, but I immediately noticed that the sound clips across different courses are not produced the same way. Some of them sound as if someone uttered the sentence in one go, while others are obviously produced by stitching individual words together, each word being recorded separately (like the automated banking applications that speak to you over the phone). Obviously the natural ones sound, well, more natural and, more importantly, correct. In some cases the unnatural clips are distorted to such an extent that it becomes very difficult to understand what is being said, even for a native speaker.

I wouldn't mind that at all if I hadn't failed a "type what you hear" exercise in my own native language. This should NEVER happen to a native speaker.

My question is this: why are there two ways of producing the clips, and is there any way to replace the words-put-together samples with natural, clear versions? That would significantly increase the quality of the courses.

Please note that I am aware this issue is not peculiar to Turkish for English speakers; it also appears in French for English speakers, but, for example, never in the Esperanto courses. On the other hand, the clips in English for Turkish speakers are significantly better, because more of the sound bites are naturally recorded than in Turkish for English speakers.

December 15, 2015



The issue is that no good text-to-speech (TTS) software has been developed for the Turkish language. Many other languages have these resources (English, German and Italian, for example). The resources just aren't available to have a perfect TTS for Turkish (trust me...I wish there were). We have done our best to delete any listening exercises for sentences that are simply unintelligible, but unfortunately our team can do nothing more about this. :) If you find a nice TTS, please let us know (but it simply doesn't exist).


The sound clips in most courses are TTS (text-to-speech): spoken text produced by a computer based on certain algorithms. The quality of these TTS engines varies. You will find that the English-language ones are generally better than those for other languages.

The fact is that Duolingo does not produce these TTS engines and voices itself; they are bought from other companies. And of course, you can only buy what is available in the stores.

I believe this is a problem that will eventually go away, as more and more companies offer TTS engines in different languages.

Sometimes some luck is involved. I believe that the Polish voice in the new Polish for English course is very good, because the TTS was produced by a Polish company for a Polish audience.


I quite agree with you and I'm relieved to hear that a native Turkish speaker has experienced this! I often have trouble understanding the Turkish clips precisely because of the issue you mention. I've heard a lot of native speakers in Turkey and it's obvious, even to a beginner like me, that the intonation is often wrong in the audio clips because they have been constructed like patchworks. I've also noticed that pronunciation changes significantly from one speaker to another. The Italian for English speakers course seems much more consistent so far, as the sentences are spoken 'in one go'.


I am relieved to know that native speakers are having difficulties with the sound as well. At first, I tried to do those listening exercises. I always got the answers wrong. Often I had absolutely no idea what I should write. It was really discouraging. After a while I turned the sound off. Now that I am reviewing my tree and I have a better idea of the grammar and vocabulary I have turned it back on, and I can manage. I thought that I was just really bad at listening! I'm glad to know that it's not just my lack of skill that is causing problems.


Thank you for the replies. Now I understand the situation.

But isn't there any way to just use pre-recorded samples instead of relying on TTS software? I am asking because the samples used in the Esperanto course definitely sound as if they were produced by recording real human speakers. Those samples are incomparably clean. I can hardly imagine there are better TTS engines for Esperanto than for Turkish (I doubt such a thing exists at all). Please correct me if they are computer-generated samples too; I may be mistaken, as I have only just started learning it, though I am fairly confident that they are not.


No, I think you are right.

The problem is that such recordings are quite an effort for the volunteers (and a cost for Duolingo). That has to be weighed against the added value of these recordings.

As long as the TTS voices are passable, those will be chosen. It might be a choice between a course with only passable TTS, or no course at all.

You also have to take into account that Duolingo offers introductory courses. The voice quality becomes more important with higher-level courses. YMMV on this... :-)


Hm, got it. So can I deduce from your comment that, since Esperanto lacks a passable TTS, they used recorded samples for that course?

On the other hand, I disagree with you on the "voice quality" thing. Beginner course or not, if a native speaker has to slow a sound clip down to comprehend it, or misunderstands it, then it is of little use for teaching anything to a beginner. It may even be harmful, in the sense that one learns the wrong pronunciation and/or intonation from day one.

By the way, what does YMMV stand for?


YMMV: Your Mileage May Vary, an internet acronym that indicates that I expect you to disagree with me. :-)


Perhaps the volunteers producing the course could ask students on the "English for Turkish speakers" course to record any sentences on the "Turkish for English speakers" course that are poorly read by the TTS.
