Translation: The small blue car is moving in between those big buses.
It's not a beta/non-beta thing; it's to do with the fact that most courses use a text-to-speech (TTS) engine (a "computer voice"), but a few (e.g. Esperanto, Irish, Hungarian) have recordings of a real human speaker.
The computer voice can speak individual words as well as full sentences, and a newly created sentence can have audio immediately.
Human recordings, by contrast, don't cover everything: Duolingo only pays for a limited number of sentences to be recorded, so new sentences will not have audio, and there is only a single speed.