Multimodal Evaluation & Metrics
Building reliable evaluation protocols for vision-language models, with a focus on grounding and cultural nuance in Japanese contexts.
Doctoral student exploring multimodal vision-and-language systems, evaluation metrics, and context-aware captioning.
Curating instruction datasets and training recipes that keep open-weight models aligned and deployable.
Teaching models to produce and consume structured outputs such as captions, diagrams, and graphs for real-world tasks.
arXiv preprint · 2026
The 1st Workshop on Multilingual and Equitable Language Technologies (MELT) · 2025
The 2nd Conference on Language Modeling (COLM) · 2025
Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: System Demonstrations) · 2025
Methodology for quickly constructing multimodal datasets tailored for Japanese vision-language models.
Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers) · 2025
Novel approach for visualizing complex legal document structures through diagram generation from text.
Sakana AI
Applied Research Engineer (Internship)
Jan 2026 – Present
National Institute of Informatics
Research Assistant
Jun 2024 – Present
Cierpa & Company
Engineering Internship
Jul 2024 – Mar 2025
RIKEN AIP
Part-time Researcher
Jul 2022 – Aug 2024
OMRON SINIC X
Research Internship
Jun 2023 – Jan 2024
Tokyo Institute of Technology
Research Assistant
Dec 2021 – Dec 2022
NTT Research Institute
Summer Internship
Aug 2022 – Sep 2022
Future Corporation
Strategic AI Group
Feb 2021 – Mar 2022
2024 – Present
Vision and Language, Evaluation
Advisor: Naoaki Okazaki
Focusing on novel evaluation metrics for multimodal systems and cross-modal representation learning.
2022 – 2024
Vision and Language: Image Captioning
Advisor: Naoaki Okazaki
Developed context-aware image captioning models that generate descriptions based on user preferences.
2018 – 2022
NLP: Grammatical Error Correction
Advisors: Naoaki Okazaki, Masahiro Kaneko (Mentor)
Created improved evaluation metrics for grammatical error correction systems.