TTS Voice Used in Cymraeg
What do you all think of the voice used - IVONA Gwyneth? As it's a computer voice, there is nothing we can do for the vast majority of sentences and words, and we get many reports of "Audio does not sound correct."
It seems pretty clear and understandable to me. It bugs me that she can't seem to pronounce 'eisiau' though. And I don't think I've ever heard a real person pronounce the 'w' in wnaeth. But I realize that you can't control the individual bits. That's something for the IVONA developers.
To be fair, the pronunciation of 'eisiau' varies a lot. In the wild you can come across eisiau, isio, isho, isha and even just 'sha. The last was in a pub in Gwynedd having asked for a 'peint o chwerw a glasiad o sbritser gwin gwyn, plîs' and then, as he turned to start the spritzer, he asked, 'sha rhew?' (would you like ice with that?).
I also find the voice clear and easy to understand. =) I am a bit concerned about the pronunciation too, though? She says some words differently to how I was taught in school - and it makes me worry about copying the pronunciation of new words that I haven't heard before. =)
It's not a super big deal, though - at the end of the day this course is still extremely helpful. Once we've got the basics we can always learn about pronunciation from real people / television / radio / etc...
There are lots of words where the pronunciation varies by area though, so they're never going to get it right for everyone. See the comments about 'eisiau' above, and even with something as simple as 'llaeth' my Mum would have said 'llaath' (long a sound) whereas in school they taught it to sound like 'llayth'
Maybe it's a problem on my end, but many sentences are missing the first couple of words or part of a word. For example, in many sentences with "nac ydy", the audio starts at the last "y" or skips it altogether. Some other odd things happen as well, such as dw i being pronounced as two syllables ("doow-ee"). Overall I like the voice, though I can't comment on its accuracy. The most annoying bit is the missing beginnings of sentences: it makes the audio tests really hard when you're missing several words and have to guess what might have been said or what the other half of a word was.
Thanks for that feedback. I'll check out the 'nac ydy'; we might be able to restructure the sentence. The reason 'dw i' comes out as dooow-ee is that the TTS is generating the sound from separate sound files for the 'd' and 'w'. It seems that Ivona recorded 'rydw i' for 'I am', while we're using the abbreviated informal form in the course.
It's not all of the time, though: sometimes it's pronounced as "dwee", sometimes not. I'll pay more attention to see if I can spot a pattern. I've been sending reports in when I find sentences that are missing audio for the beginning of the sentence. If those play fine for you, then maybe it's a problem on my end. But there are quite a few sentences (not just nac ydy) with that problem. I haven't seen anyone else mention it, though, which has struck me as odd.
I have to add that sometimes the download speed (bandwidth) heavily affects whether the first sounds of the voice are played or not. I experienced the same with Italian, German and English for Hungarian, too. I usually try replaying the audio, and the second time it plays more clearly, especially in Italian, where it seems to be digitized human sounds. I had many problems with eisiau, too, and replaying sometimes helped.
Of course some words are coming out strangely (eisiau for example), but then that happens in the French one too and that's a highly tuned course. Perhaps with tweaking all those kinks can be sorted. As a South Walian it's nice to hear such a neutral accent - I was afraid she would have a strong northern accent and I would find myself reluctant to copy her.
Mostly it is good. The one main problem I have come across so far is that when she says "dyn ni" she seems to pronounce "dyn" as though she is saying the Welsh word for "man" (the two words are spelt the same but my understanding is that the vowel sound should be shorter when saying "dyn ni").
Would it be improved by putting the apostrophe in front of abbreviated words - 'dyn ni - and so on? Very rarely seen in the wild now, and to modern eyes it looks rather pretentious, but perhaps it could be used to persuade IVONA Gwyneth to sort out her vowel lengths.
I have had a few problems sometimes sorting out her fydda from her bydda and so on when there has been no context as to whether the phrase is from a statement, a question or a negation. All in all, though, it is a pretty good neutral accent.
No, the apostrophe doesn't make a difference. The main issue is that the recorded phrase is 'rydyn ni, which is pronounced perfectly, whereas the only 'dyn' available to the voice is 'dyn' as in man, which has a long 'y' sound. But later on with other verbs this is less of a problem. Despite the occasional quirks it's still better to have a voice than not.
Just a thought, thinking about how links and so on can be embedded in text as 'click here' and behind that is some great long URL which is what you are really clicking on...
Within Duolingo, could IVONA Gwyneth be given the text to speak as, for example, dỳn ni'n with the accented ỳ as a short vowel marker, while the text shown on the screen is simply dyn ni'n without the accent?
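To make that suggestion concrete, here is a minimal sketch in Python of keeping the displayed sentence and the text handed to the voice separate. Everything in it is hypothetical: the `SPOKEN_OVERRIDES` table and the `spoken_form` helper are invented for illustration and are not part of Duolingo or IVONA, and whether Gwyneth would actually shorten the vowel on seeing ỳ is untested.

```python
import re

# Hypothetical override table: displayed phrase -> spelling sent to the voice.
# The accented form is only a guess at a short-vowel hint.
SPOKEN_OVERRIDES = {
    "dyn ni": "dỳn ni",
}


def spoken_form(display_text: str, overrides: dict[str, str] = SPOKEN_OVERRIDES) -> str:
    """Return the text to hand to the TTS voice; display_text itself is not changed."""
    spoken = display_text
    for shown, said in overrides.items():
        # Whole-phrase match only, so 'rydyn ni' is left alone.
        pattern = re.compile(r"\b" + re.escape(shown) + r"\b", re.IGNORECASE)

        def swap(match: re.Match, said: str = said) -> str:
            # Keep a sentence-initial capital from the displayed text.
            return said.capitalize() if match.group(0)[0].isupper() else said

        spoken = pattern.sub(swap, spoken)
    return spoken


shown_on_screen = "Dyn ni'n hoffi coffi"
sent_to_voice = spoken_form(shown_on_screen)
```

The learner would still see `shown_on_screen`, and only `sent_to_voice` would go to the synthesiser, much like the 'click here' link with a long URL behind it.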
I definitely like the fact that you can play the audio at normal speed and at slow speed. And it's also great that I can hover over a word to hear just that word's pronunciation. Neither of those things work (yet) in the Irish course, so I'm really appreciating them here. Those are definitely huge advantages of using the computer voice.
Is the sound linked automatically through Duolingo, or are the course creators uploading individual word files? If the latter, maybe it's possible to manipulate the spellings in order to fix some of the really "off" words. I had "cranc" earlier as one of the "write what you hear" questions, and it was indistinguishable.
Let me begin by saying that I think Duolingo Welsh, including the artificial voice with its shortcomings, is great. Nevertheless, I don't see why a natural voice, or better still two or more voices from different regions, could not be used. Is it just that it is hard to find people to take all the time that would be necessary?
As far as I know, the robot voice is created by making one person say thousands and thousands of sentences, which are then cut apart into little word-sized or even syllable-sized (or smaller) snippets which are then pasted together.
So it goes back to a human voice in the end.
So conceivably one could find someone who would be willing and able to speak in such a neutral dialect as the person they took to create the IVONA voice from, and have them speak the sentences.
Though you're right that a big advantage of a robot voice is the instant ability to have all sentences - including new ones that course maintainers come up with later (or ones that were initially wrong and that get changed, thus creating a sentence that's different from any they had previously).
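For anyone curious how the snippet-pasting described above leads to oddities like the two-syllable 'dw i', here is a toy sketch in Python. It is only an illustration under stated assumptions: the `RECORDED_UNITS` inventory is invented, the units are plain spellings rather than real audio, and an actual engine would choose among phonetic units using much more careful matching than this greedy longest-match.

```python
# Invented inventory of "recorded" snippets, standing in for audio clips.
# This is not IVONA's actual unit database.
RECORDED_UNITS = {"rydw i", "rydyn ni", "dyn", "ni", "d", "w", "i", " "}


def pick_units(text: str, inventory: set[str] = RECORDED_UNITS) -> list[str]:
    """Cover text left to right, always taking the longest snippet available."""
    units = []
    pos = 0
    while pos < len(text):
        # Try the longest remaining stretch first, then shorter ones.
        for end in range(len(text), pos, -1):
            if text[pos:end] in inventory:
                units.append(text[pos:end])
                pos = end
                break
        else:
            raise ValueError(f"no recorded unit covers {text[pos]!r}")
    return units


print(pick_units("rydw i"))  # ['rydw i']
print(pick_units("dw i"))    # ['d', 'w', ' ', 'i']
print(pick_units("dyn ni"))  # ['dyn', ' ', 'ni']
```

The full form comes out as one recording, while the abbreviated forms get stitched from smaller pieces, including whichever 'dyn' happens to be in the inventory.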
That was certainly the way things used to be done until fairly recently, before truly authentic sounding voice synthesis became available. Most modern TTS is able to read any text and use pronunciation rules to generate the sounds in a fairly natural speech pattern. Originally, recordings were used as a starting point for the voices, but I think that in most cases they are entirely generated from scratch from the rules.
With the former method, unless entire sentences were recorded you would often hear odd jerky gaps, mis-jointed sentences and odd intonation where separately recorded words were stitched together.
With various languages on Duolingo, real recorded spoken words have to be used occasionally when the rules don't quite match what is required, and you hear a different voice. Somewhat oddly, this most often occurs only with the single-word hints, whereas the sentence itself is still TTS.
Personally, I find the voice clear and quite pleasant and in no way robotic sounding. Things may sound very different with different devices. I use an iPad for Duolingo with (pretty good quality) bluetooth headphones. Anything really unnatural sounding would stand out markedly.
IVONA Gwyneth gets most of the sounds out correctly (although there are some exceptions in terms of vowel length, e.g. rendering short ɛ instead of long e). My main bugbear is in the suprasegmentals: she just doesn't seem to get the intonation correct. Is it possible to teach her some authentic phrase and sentence prosody, please? Diolch.
I can't comment on the accuracy of the TTS as a non-Welsh speaker, but I generally find the voice clear and understandable, and also very pleasant to listen to. It comes across as fairly natural to me, so on the whole I really like it.
I have encountered a few, very few, instances of artefacts/distortion/noise on the audio with certain words within certain sentences. I have reported those and attempted to explain the problem when they've appeared. Usually the problem happens for each repetition. I would say however that it is no worse than any other course I've done, at this stage.
The only other problems are with the website and beyond your control or influence, I'm sure. Using an iPad with the website the audio doesn't usually play automatically - and sometimes the beginning of sentences may be muted. Usually, these can be heard in full on the second or third attempt. So apart from the fact that there can sometimes be a significant time lag between tapping and the audio starting, this isn't a significant issue. Nevertheless, I really look forward to Welsh arriving on the app.
Regarding the "other problem" with the lack of autoplay and missing first words / syllables, I can only agree. I use the site on a Samsung SM-T800 Android tablet (always updated, now I think it has Android 5.1.1) and I experience the same problem. I thought it might be a connection / bandwidth problem but now I think it may be a site bug or "hidden feature". Second and third attempts are usually better, but very often the start of the audio, sometimes a whole word, is muted or missing. Slow play is more reliable.
BTW if I have a suggestion about the whole site, where should I report it?
I suspect (guessing) the problem with tablets is that the site is expecting desktop browser type 'click' events as opposed to 'tapping' events. Double-taps somehow translate to clicks, possibly.
I think it can be a connection bandwidth issue with the audio on occasions but mostly not. It isn't usually an issue with the app.
The other issue about tablets (iPads) anyway is that originally/several months ago the site used to resize itself correctly to an iPad width in portrait mode. It no longer does that - too wide.
Regarding suggestions about the whole site - not sure, sorry. There may be some discussion groups but what you really need is a site-wide reporting mechanism. I've not found one, but then I haven't looked terribly hard. Normally I use the app, which works better but doesn't give direct access to discussions, and although they recently introduced problem reporting I'm yet to be convinced that it actually does anything!
I like that the audio can be played at a slower speed. However when I do the Strengthen Skills activity and am asked to translate from the audio I am usually baffled as I have forgotten the words and have no idea of how Welsh is pronounced. I usually play the audio over and over but even then I find it hard to make out what is being said. Also sometimes the first letter of the first word is hard to figure out.
If you go to www.ivona.com you can try listening to phrases using 'Welsh, Geraint' - that may sometimes be clearer to some ears than the 'Gwyneth' voice currently used by Duo. You can also go to the on-line dictionary at www.gweiadur.com and listen to real-voice recordings of many individual words.