Well, it's impossible to answer a listening exercise (one without text) correctly when the audio is wrong. In text translation exercises, okay, wrong audio is confusing, but at least you have the right text to answer from; in listening comprehension without any text, it's just stupid guessing. IMO the wrong audio clips should at least be excluded from listening comprehension exercises immediately, if not corrected.
But why not change the text to suit the audio? That way we would have some chance of getting it right.