"It is seven o'clock."
7 has different readings, such as 「なな」 and 「しち」. For seven o'clock, the reading 「しち」 is used, so it is しちじ. So, while the TTS does seem to get quite a few readings wrong, this one is actually right.
Note: 4 o'clock is よじ instead of よんじ, and 9 o'clock is くじ instead of きゅうじ. The other hours are "regular".
Meanwhile, Japanese speakers tend to avoid the reading し for the number four, because it is a homophone of the word for 'death'. く (9) is sometimes avoided because it sounds the same as the word for 'pain', but, irregularly, it is kept when telling time. The reading なな for seven is preferred when しち could be confused with いち (one), e.g. over a poor phone connection.
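For anyone who finds a lookup table easier to remember, here is a minimal sketch of the hour readings discussed above. The names `HOUR_READINGS` and `hour_reading` are made up for illustration; the table covers only the twelve clock hours and says nothing about minutes.

```python
# Hiragana readings of the numbers 1-12 as used before じ (o'clock).
# 4, 7 and 9 take よ, しち and く here rather than よん, なな and きゅう.
HOUR_READINGS = {
    1: "いち", 2: "に", 3: "さん", 4: "よ",
    5: "ご", 6: "ろく", 7: "しち", 8: "はち",
    9: "く", 10: "じゅう", 11: "じゅういち", 12: "じゅうに",
}


def hour_reading(hour: int) -> str:
    """Return the hiragana reading of `hour` o'clock (1-12)."""
    if hour not in HOUR_READINGS:
        raise ValueError(f"hour must be between 1 and 12, got {hour}")
    return HOUR_READINGS[hour] + "じ"


if __name__ == "__main__":
    for h in (4, 7, 9):
        print(h, hour_reading(h))
```

Running it prints the three hours that tend to trip people up, so they can be compared against the audio in the lesson.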
Nevertheless, they need to be consistent within a particular lesson. Teach us the exact thing the lesson is teaching. Don't make us regurgitate a particular pronunciation for a particular lesson while the attached audio doesn't at all match the phonetics we are matching it to. They are doing this even when a particular pronunciation is wrong.
It is a bit ambiguous how to say hours and minutes correctly when building the sentence, given the way the options are pronounced: the voice-over is always なな, even when it seems like しち should be used for the hour in this context... and I start to forget whether it should be しち or なな for minutes...