Depth is part of staging. Imagine a stage with the sound being played to you: there's width, height, depth and listening position (where the stage starts, i.e. how far away you are from it).
Imaging is more about being able to tell where each instrument is playing from. When you play a track, you can clearly hear and picture where the musicians are standing, e.g. vocalist front and centre, guitarist right, bassist left, drummer behind, saxophonist far right, pianist far left, and so on.