#vision-language models

2026-01-01 Tue-Thu Van-Dinh*, Hoang-Duy Tran*, Truong-Binh Duong, et al. AAAI Workshop on AI for Scientific Research, 2026

A Vietnamese benchmark for evaluating single-image and multi-image reasoning on information-rich infographics.

2025-10-01 Yen-Linh Vu*, Dinh-Thang Duong*, Truong-Binh Duong, et al. VisionDocs Workshop at ICCV 2025

An investigation of region-level descriptions from the Describe Anything Model for visual question answering on text-rich images.