The words 'o urso' trip over each other unless I listen to the "slower" version. Just a tiny bit slower would help. The word 'urso' begins before the word 'o' is complete.
Based on Spanish, I'm guessing that the two form a diphthong and thus should be pronounced as one phoneme, so separating them would be inaccurate.
Why is "o" being pronounced differently? This audio sounds incorrect to me. "O" shouldn't be "ooooo" as in the word "hood"; instead, it's supposed to be more like the "o" in the word "so", or the same "o" as in "molho" or "urso".