// LLM Researcher · Tokyo

Chengguang Gan

LLM Researcher @ Techtouch · Ph.D. in Informatics

I build and study large language models for information extraction, web agents, and the Mutual Reinforcement Effect — and I ship the datasets and models to back it up.

Tokyo, Japan
// 01

About

I’m an LLM Researcher at Techtouch in Tokyo. I earned my Ph.D. in Informatics in March 2025 from Yokohama National University, advised by Prof. Tatsunori Mori. My work spans natural language processing and large language models — with a focus on information extraction, prompting, multimodal IE, and web agents.

I introduced the Mutual Reinforcement Effect and built the Japanese (and later multilingual) IE Mix datasets — covering sentence, text, sentiment and POS classification, relation and event extraction — then fine-tuned a line of open LLMs on top of them. Everything ships: you can grab the papers, code, datasets, and models on Hugging Face and GitHub.

  • Large Language Models
  • Information Extraction
  • Prompting
  • Mutual Reinforcement Effect
  • Multimodal IE
  • Web Agents
  • Japanese NLP

Reviewer for NeurIPS, COLING, and COLM.

// 02

News

// 03

Publications

Bold = me. Citation counts via Google Scholar (Jun 2026).

  1. 2026

    MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models

    Chengguang Gan, Qingyu Yin, Xinyang He, Hanjun Wei, Yunhao Liang, Younghun Lim, Shijian Wang, Hexiang Huang, Qinghao Zhang, Shiwen Ni, Tatsunori Mori

    ACL 2026 Findings arXiv
  2. 2026

    GuideWeb: A Benchmark for Automatic In-App Guide Generation on Real-World Web UIs

    Chengguang Gan, Yoshihiro Tsujii, Yunhao Liang, Tatsunori Mori, Shiwen Ni, Hiroki Itoh

    Preprint arXiv
  3. 2025
  4. 2025
  5. 2025
  6. 2025

    Exploring Behavior-Driven Development for Code Generation

    Yunhao Liang, Chengguang Gan, Ruixuan Ying, Zhe Cui

    ICIC 2025 Springer
  7. 2025

    Retrieval and Distill: A Temporal Data Shift-Free Paradigm for Online Recommendation System

    Lei Zheng, Ning Li, Chengguang Gan, Yong Yu, Weinan Zhang

    APWeb-WAIM 2025 Springer
  8. 2025

    M-MRE: Extending the Mutual Reinforcement Effect to Multimodal Information Extraction

    Chengguang Gan, Zhixi Cai, Yanbin Wei, Yunhao Liang, Shiwen Ni, Tatsunori Mori

    Preprint arXiv
  9. 2025

    Decoding Prokaryotic Whole Genomes with a Product-Contextualized Large Language Model

    Shiwen Ni, Sheng Li, Shijian Wang, Xin Bi, Yang Li, Chengguang Gan, et al.

    bioRxiv bioRxiv
  10. 2024

    Application of LLM Agents in Recruitment: A Novel Framework for Resume Screening

    Chengguang Gan, Qinghao Zhang, Tatsunori Mori

    Journal of Information Processing cited by 151 J-STAGE
  11. 2024

    II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models

    Ziqiang Liu, Feiteng Fang, Xi Feng, Xinrun Du, Chenhao Zhang, Zekun Wang, Yuelin Bai, Qixuan Zhao, Liyang Fan, Chengguang Gan, et al.

    NeurIPS 2024 cited by 21 Proceedings
  12. 2024
  13. 2024
  14. 2024

    Demonstrating Mutual Reinforcement Effect through Information Flow

    Chengguang Gan, Xuzheng He, Qinghao Zhang, Tatsunori Mori

    Preprint arXiv
  15. 2023
  16. 2023

    A Few-Shot Approach to Resume Information Extraction via Prompts

    Chengguang Gan, Tatsunori Mori

    NLDB 2023 cited by 16 Springer
  17. 2023
  18. 2023
  19. 2023
  20. 2021

    英文履歴書データ抽出システムへの BERT 適用性の検討 (A Study of the Applicability of BERT to an English Resume Data Extraction System)

    Chengguang Gan, Yoshihide Takahashi

    IPSJ Kansai 2021 IPSJ
// 04

Experience & Education

Experience

  • 2025 — Present
    LLM Researcher · Techtouch
  • 2025
    Data Scientist · Generic Solution
  • 2024 — 2025
    Part-time Researcher · NII Large Language Model Center
  • 2024
    Research Part-timer · RIKEN AIP
  • 2022 — 2023
    Research Assistant · Yokohama National University

Education

  • 2022 — 2025
    Ph.D. in Informatics · Graduate School of Environment and Information Sciences, Yokohama National University
  • 2020 — 2022
    M.S. in Information Technology · The Kyoto College of Graduate Studies for Informatics
  • 2014 — 2018
    East China Jiaotong University · Nanchang, China
// 05

Open Source

Datasets, models, and code behind the papers — free to use.

More on Hugging Face and GitHub.

// 06

Beyond Research

When I’m away from the terminal.

// 07

Get in touch

Open to research collaborations and roles in LLMs, NLP, and agents. Best reached by email — or find me on the platforms below.

ganchengguan@yahoo.co.jp