Truong-Binh Duong

AI VIETNAM

AI researcher working on multimodal learning, vision-language models, and robust visual reasoning


#vision-language models

Content tagged with "vision-language models"

ViInfographicVQA: A Benchmark for Single and Multi-Image Visual Question Answering on Vietnamese Infographics
2026-01-01 · Tue-Thu Van-Dinh*, Hoang-Duy Tran*, Truong-Binh Duong, et al. · AAAI Workshop on AI for Scientific Research, 2026
#Vision-Language Models #Visual Question Answering #Multi-Image Reasoning #Vietnamese AI

A Vietnamese benchmark for evaluating single-image and multi-image reasoning on information-rich infographics.

Describe Anything Model for Visual Question Answering on Text-Rich Images
2025-10-01 · Yen-Linh Vu*, Dinh-Thang Duong*, Truong-Binh Duong, et al. · VisionDocs Workshop at ICCV 2025
#Vision-Language Models #Text-Rich Images #Visual Question Answering #Document Intelligence

An investigation of region-level descriptions from the Describe Anything Model for visual question answering on text-rich images.

© 2026 Truong-Binh Duong.