Date

2026

Document Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Computer Science

Committee Chair

Deepak Venugopal

Committee Member

Madhusudhanan Balasubramanian

Committee Member

Vasile Rus

Committee Member

Xiaofei Zhang

Abstract

Multimodal AI systems integrate computer vision, natural language processing, and knowledge representation. While deep learning has made immense advances in tasks such as Visual Captioning (VC) and Visual Question Answering (VQA), it is hard to decipher the knowledge encoded within these models in order to verify, evaluate, and explain their behavior. In this dissertation, we propose to i) develop a probabilistic framework to evaluate uncertainty in captioning models using Markov Logic Networks (MLNs), a well-known statistical relational model; ii) disentangle knowledge gained in fine-tuning from preexisting knowledge encoded in pre-trained captioning models using a Neuro-Symbolic extension of MLNs called Hybrid Markov Logic Networks; and iii) understand the sensitivity and limitations of Vision Large Language Models (VLMs) in VQA when questions are modified in ways that make them cognitively more demanding to process. In summary, our dissertation advances the understanding and evaluation of multimodal AI systems.

Comments

Data is provided by the student

Library Comment

Dissertation or thesis originally submitted to ProQuest/Clarivate.

Notes

Open Access

Recommended Citation

Shah, Monika, "Advances In Understanding Multimodal AI Systems" (2026). Electronic Theses and Dissertations Archive. 3938.
https://digitalcommons.memphis.edu/etd/3938

Archival Statement

This item was created or digitized prior to April 24, 2027, or is a reproduction of legacy media created before that date. It is preserved in its original, unmodified state specifically for research, reference, or historical recordkeeping. This material is part of a digital archival collection and is not utilized for current University instruction, programs, or active public communication. In accordance with the ADA Title II Final Rule, the University Libraries provides accessible versions of archival materials upon request. To request an accommodation for this item, please submit an accessibility request form.

Electronic Theses and Dissertations Archive

Advances In Understanding Multimodal AI Systems
