Oral Presentation 2
1:15 PM to 3:00 PM
- Presenter
- Andre Ye, Senior, Computer Science, Philosophy (UW Honors Program)
- Mentor
- Ranjay Krishna, Computer Science & Engineering
- Session
Session O-2P: Large Language Models: Engineering and Social Requirements
- CSE 305
- 1:15 PM to 3:00 PM
I investigate how cultural and linguistic backgrounds influence visual perception and semantic interpretation in computer vision. This study asks: does the semantic content described by vision-language datasets and models vary significantly across languages? Guided by the hypothesis that cultural and linguistic diversity leads to distinct semantic interpretations, I compare multilingual datasets against their monolingual counterparts. I developed metrics, including scene graph complexity, embedding space width, and linguistic diversity, to quantify semantic variation across languages in both human-annotated and model-generated image captions. The methodology uses linguistic tools and translation techniques to ensure semantic consistency across languages. My findings indicate that multilingual captions contain, on average, 21.8% more objects, 24.5% more relations, and 27.1% more attributes than monolingual ones. Furthermore, models trained on linguistically diverse content generalize better across datasets in different languages. This study contributes to the understanding of how language and culture shape visual perception in computer vision and advocates for more inclusive dataset compilation and model training strategies.
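The abstract names its metrics without defining them, so the following is only an illustrative sketch of how comparisons of this kind could be computed. It assumes each caption has already been parsed into a scene graph with sets of objects, relations, and attributes, and it reads "embedding space width" as mean pairwise cosine distance, which is one possible interpretation rather than the study's stated definition. The function names (`union_size`, `relative_increase`, `embedding_width`) and the toy scene graphs are hypothetical and are not taken from the study.

```python
import math
from itertools import combinations


def union_size(graphs, key):
    """Count distinct items of one type (e.g. "objects") across scene graphs."""
    return len(set().union(*(g[key] for g in graphs)))


def relative_increase(multi, mono):
    """Percentage by which the multilingual count exceeds the monolingual one."""
    return 100.0 * (multi - mono) / mono


def embedding_width(vectors):
    """Mean pairwise cosine distance between caption embeddings.

    Assumption: this is one plausible reading of "embedding space width".
    """
    def cosine_distance(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return 1.0 - dot / (norm_a * norm_b)

    pairs = list(combinations(vectors, 2))
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)


# Toy, invented scene graphs for a single image (not data from the study).
# Monolingual set: two captions written in the same language.
mono_graphs = [
    {"objects": {"dog", "ball"},
     "relations": {("dog", "chases", "ball")},
     "attributes": {("dog", "brown")}},
    {"objects": {"dog", "grass"},
     "relations": {("dog", "on", "grass")},
     "attributes": {("dog", "brown")}},
]
# Multilingual set: two captions from different languages, translated
# into a common language before parsing.
multi_graphs = [
    {"objects": {"dog", "ball"},
     "relations": {("dog", "chases", "ball")},
     "attributes": {("dog", "brown")}},
    {"objects": {"dog", "park", "tree"},
     "relations": {("dog", "in", "park")},
     "attributes": {("tree", "tall")}},
]

for key in ("objects", "attributes"):
    mono_count = union_size(mono_graphs, key)
    multi_count = union_size(multi_graphs, key)
    print(key, mono_count, multi_count,
          round(relative_increase(multi_count, mono_count), 1))
```

Comparing the union of scene graphs over caption sets of equal size keeps the comparison fair: any increase then reflects the variety of what is mentioned, not the number of captions.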