Project Detail

Visual Internal Reasoning

Structured latent image tokens for causally grounded visual reasoning inside language models.

Quick Explanation

This project tests whether language models reason better when they generate internal visual latent states before answering, and measures that effect with causal controls.
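
As a rough illustration of what a causal control could look like here, the sketch below compares answer accuracy when the model's latent states are left intact against a control run on the same examples. The control condition (shuffled latents), the function names, and the correctness flags are all hypothetical and chosen only for illustration; they are not the project's stated protocol or results.

```python
from statistics import mean


def accuracy(correct):
    """Fraction of examples answered correctly, given boolean flags."""
    return mean(1.0 if c else 0.0 for c in correct)


def latent_effect(intact, control):
    """Accuracy with intact latents minus accuracy under a control condition.

    Both lists must hold per-example correctness flags for the same
    examples in the same order, so the two runs are directly comparable.
    """
    if len(intact) != len(control):
        raise ValueError("conditions must cover the same examples")
    return accuracy(intact) - accuracy(control)


# Made-up correctness flags, for illustration only (not real results).
intact = [True, True, False, True]      # latents generated as usual
shuffled = [True, False, False, True]   # assumed control: latents shuffled
effect = latent_effect(intact, shuffled)
```

A positive difference would suggest the answers depend on the content of the latent states rather than on the extra computation alone, though a real evaluation would need many more examples and an uncertainty estimate.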

Multimodal · Latent Reasoning · Causal Evaluation