Spatial Intelligence
The Spatial Intelligence Group investigates how vision-language models perceive, reason about, and evaluate visual content at the spatial level. Our research examines whether multimodal AI systems genuinely ground their judgments in visual evidence or instead rely on linguistic shortcuts, with a focus on image quality assessment across structured quality dimensions. By combining attention analysis, interpretability methods, and established image quality metrics, we develop frameworks to measure and improve spatial grounding in large vision-language models. Our goal is to build AI systems that can reliably assess, compare, and explain visual quality, bridging the gap between human visual perception and machine understanding for applications in generative AI evaluation, automated quality assurance, and scientific figure analysis.






























