Since "soy" is the yo form of ser, there's an implied "yo" in front of it. There's no difference in meaning; both mean "I am." I'm not sure how they differ in everyday conversation, but in my experience, novice learners usually use "yo soy," "tú eres," and the like to help with conjugations, even where "yo" or "tú" isn't needed.
I know it may be confusing, but Duolingo can’t assign a specific voice to every single question. That would take an incredibly long time, and remember, they also teach more than 20 other languages. Making sure the questions have the right voice is probably the least of their priorities. Besides, what if you encounter someone and can’t tell their gender by their voice (which you shouldn’t do anyway)? You listen to the words. The words are far more important than the voice.
The voices don't always match the questions. The questions aren't designed specifically for listening; the same sentences are used in many exercises. Duolingo can't control which TTS voice is used for every question. Coding and assigning a specific voice to each question would take an extremely long time, and is probably impossible.
Listen to the words, not the voice. This is not a real person, and it would take the Duolingo devs a long time to assign a gendered voice to every sentence, when they already provide us with an expansive free learning platform. Besides, in the real world, you won’t always be able to tell someone’s gender from their voice: some men have high voices, some women have low voices, and some people have voices that could be either. You have to ask, not assume. If someone says they’re a woman but you think they have a low voice, you wouldn’t call them a man just because of their voice. You’d call them a woman because they said so.
It would take less time and effort than many of the other things in the app, and could be used to quickly improve not just mismatched-gender exercises but also many other audio clarity issues. These are not minor problems either: a learner needs to learn the rules before the exceptions in order to understand that they are exceptions. If a person hears randomly gendered voices for first-person gendered statements, they may not notice the differences in the words.