Analysis and Evaluation of VLMs in Multimodal Scene Understanding

Master's Thesis