While I understand why you'd think that, it actually is stressed on the right syllable. A native speaker should be able to tell the difference. The problem is that, because this is a question, the TTS recording has the voice go very high and, excessively IMO, emphasise the last syllable. But, overall, it is voiced correctly.
The sound is very blurred on some of these higher-level recordings. I agree with previous comments that the first word sounds like και and the second like πόσα. I listened to the recording several times, both at normal speed and slowed down. It is frustrating. Part of the problem is the lack of context. If this were presented as part of a conversation, it would not be so much of a problem.