Deep Learning for Individualized Treatment Effect Inference with Multimodal Depiction