This is an AcI construction (accusativus cum infinitivo), which Latin did quite a bit but German (and English) also do a little.
The main verb in English or German is usually one of perception (see, hear, feel), and it's followed by a noun or pronoun in the accusative case and then by a verb in the infinitive.
The accusative-case object of the main verb is the logical subject of the infinitive.
So in Ich sehe ihn kommen "I see him come", the person who is coming is er/"he".
Or Ich höre den Hund bellen "I hear the dog bark", the thing which is barking is der Hund/"the dog".
Side note: Latin also did this with a lot more verbs such as "know" (the equivalent of Ich weiß ihn ein Buch lesen/"I know him read a book" would mean Ich weiß, dass er ein Buch liest/"I know that he is reading a book").