The woman's voice sounds very odd, and very often... That's not exactly how people in Brazil speak.
The audio pronounces 'quem' as 'keem', but if you click on 'quem' it's pronounced as 'kem'. Which is closer?
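I thought the same and checked to be sure. It's OK. When 'who', 'what', 'which' or 'whose' is the subject or part of the subject, we don't use an auxiliary verb. So it's 'who owns this bag' ('who' is the subject, no auxiliary) but 'who do you love most' ('who' is the object, so 'do' is needed).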