Understanding Multimodal LLMs
A deep dive into how large language models are evolving to understand both text and images.
Read more →Doctoral student exploring multimodal vision-and-language systems, evaluation metrics, and context-aware captioning.
Long-form notes on multimodal research, evaluation, and community updates.
A deep dive into how large language models are evolving to understand both text and images.
Read more →This is my first post using Jekyll. Isn't it cool?
Read more →