An Automated Pipeline for Constructing a Vietnamese VQA-NLE Dataset

Summary

This work presents an automated pipeline for constructing a Vietnamese visual question answering dataset with natural-language explanations (VQA-NLE).

The pipeline uses multiple language models for translation, generation, evaluation, and quality control, reducing the amount of manual annotation required to construct multilingual multimodal datasets.
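
As a rough illustration of how the four stages named above could fit together, here is a minimal Python sketch. It is not the project's actual code: the `Sample` fields, the `build_dataset` function, the stub stages, and the 0.5 score threshold are all hypothetical, and each stage is a plain callable standing in for a language-model call.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Sample:
    """One VQA item; field names are illustrative assumptions."""
    image_id: str
    question: str
    answer: str
    explanation: str = ""
    score: float = 0.0


def build_dataset(
    samples: Iterable[Sample],
    translate: Callable[[str], str],
    generate: Callable[[Sample], str],
    evaluate: Callable[[Sample], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the source
) -> List[Sample]:
    """Translate, generate an explanation, score it, and keep passing samples."""
    kept = []
    for s in samples:
        # Translation stage: question and answer into the target language.
        s = replace(s, question=translate(s.question), answer=translate(s.answer))
        # Generation stage: produce a natural-language explanation.
        s = replace(s, explanation=generate(s))
        # Evaluation stage: assign a quality score to the sample.
        s = replace(s, score=evaluate(s))
        # Quality control: drop samples scoring below the threshold.
        if s.score >= threshold:
            kept.append(s)
    return kept


# Stub stages so the sketch runs without any model; real stages would call LLMs.
def stub_translate(text: str) -> str:
    return f"[vi] {text}" if text else ""


def stub_generate(s: Sample) -> str:
    return f"because {s.answer}" if s.answer else ""


def stub_evaluate(s: Sample) -> float:
    return 1.0 if s.explanation else 0.0
```

Passing the stages in as callables keeps the control flow independent of any particular model or API, so individual stages can be swapped or tested in isolation.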

My Contribution

I designed and implemented major components of the automated data construction and evaluation pipeline.

Resources

Paper
Code and dataset