"A human sits in a tree."
Translation: Homo in arbore sedet.
"Arbore" isn't the direct object. A direct object is the thing the verb acts on directly. So if we had something like "Homo arborem pulsat," the hitting is done to the tree (accusative "arborem"); in other words, the tree is hit.
The sitting isn't done to anything; the person is simply sitting in the tree. You can't "sit a tree," nor can you say that the tree "is sat."
"Arbor-" is the object of the preposition "in," so "in" governs the case that "arbor-" takes. "In" takes the ablative when it refers to a stationary location (and the accusative when it expresses motion into something), so here we have the ablative "arbore."