Chen Zhang (张辰)

About Me

I am currently an AI researcher at Meituan M17. Prior to that, I received my Ph.D. from Beijing Institute of Technology (BIT), where I was advised by Prof. Dawei Song. I collaborated closely with Prof. Benyou Wang from CUHK-SZ on efficient language models and with Prof. Qiuchi Li from UCPH on structural bias. I previously worked closely with Dr. Jingang Wang from Meituan NLP and Dr. Qifan Wang from Meta AI on large language models. I also enjoyed building structure-grounded language models with Binyuan Hui from Alibaba.

My current research interests lie in the general area of natural language processing, particularly efficient language models and language agents. Previously, my work focused on opinion mining and model generalization.

I am actively looking for self-motivated research interns who are interested in long-context language models or anything related to efficient language models. Drop me an email at chenzhang9702[AT]outlook[DOT]com if you would like to collaborate.

Recent Highlights

July 30th, 2025. Our paper “Towards the Law of Capacity Gap in Distilling Language Models” received the ACL 2025 Outstanding Paper Award.

July 6th, 2025. One long paper was accepted to ACM MM 2025.

May 15th, 2025. One long paper was accepted to ACL 2025.

Apr. 30th, 2025. Our team CUHKSZ-HUAWEI won the gold medal at AIMO 2025.

Jan. 22nd, 2025. One long paper was accepted to NAACL 2025.

Sep. 5th, 2024. LongLLaVA was released as the first large multi-modal model that can process over 1,000 images on a single Nvidia A100.

July 25th and Aug. 8th, 2024. Invited to give talks on long-context efficiency at Li Auto and on the democratization of LLMs at ByteDance Research, respectively.

Projects

LongLLaVA
A hybrid-architecture large multi-modal model, the first to process over 1,000 images on a single Nvidia A100.
[huggingface]

MiniMA, MiniMA-2, MiniMix, MiniLoong
A distilled language model family that establishes a new compute-performance Pareto frontier among existing language models.
[github][huggingface][rank]

MiniChat, MiniChat-2
An instruction-following language model that achieves competitive performance at a small scale.
[github][huggingface][rank]

Phoenix
An instruction-following language model that is competitive with ChatGLM-6B.
[github][huggingface][rank][news]

WenJin
A large language model that reaches top-level performance on the CLUE benchmark.
[rank]

Publications & Manuscripts [google scholar][semantic scholar]

# indicates equal contribution.

Efficient Language Models

MoDification: Mixture of Depths Made Easy.
Chen Zhang, Meizhi Zhong, Qimeng Wang, Xuantao Lu, Zheyu Ye, Chengqiang Lu, Yan Gao, Yao Hu, Kehai Chen, Min Zhang, and Dawei Song.
In NAACL 2025. [arXiv]

ZigZagKV: Dynamic KV Cache Compression for Long-context Modeling based on Layer Uncertainty.
Meizhi Zhong, Xikai Liu, Chen Zhang, Yikun Lei, Yan Gao, Yao Hu, Kehai Chen, and Min Zhang.
In COLING 2025. [arXiv]

Understanding the RoPE Extensions of Long-context LLMs: An Attention Perspective.
Meizhi Zhong, Chen Zhang, Yikun Lei, Xikai Liu, Yan Gao, Yao Hu, Kehai Chen, and Min Zhang.
In COLING 2025. [arXiv]

LongLLaVA: Scaling Multi-modal LLMs to 1,000 Images Efficiently via Hybrid Architecture.
Xidong Wang, Dingjie Song, Shunian Chen, Chen Zhang, and Benyou Wang.
Preprint. [arXiv][code]

Beyond the Speculative Game: A Survey of Speculative Execution in Large Language Models.
Chen Zhang, Zhuorui Liu, Hanqing Zhang, and Dawei Song.
Preprint. [arXiv]

Towards the Law of Capacity Gap in Distilling Language Models.
Chen Zhang, Qiuchi Li, Dawei Song, Zheyu Ye, Yan Gao, and Yao Hu.
In ACL 2025, Outstanding Paper Award. [arXiv][code]

How Speculative Can Speculative Decoding Be?
Zhuorui Liu, Chen Zhang, and Dawei Song.
In COLING 2024. [paper][poster][code]

Task-agnostic Distillation of Encoder-Decoder Language Models.
Chen Zhang, Yang Yang, Qiuchi Li, Jingang Wang, and Dawei Song.
In COLING 2024. [arXiv][poster][code]

Lifting the Curse of Capacity Gap in Distilling Language Models.
Chen Zhang, Yang Yang, Jiahao Liu, Jingang Wang, Yunsen Xian, Benyou Wang, and Dawei Song.
In ACL 2023. [arXiv][slides][code]

On Elastic Language Models.
Chen Zhang, Benyou Wang, and Dawei Song.
In TOIS. [arXiv]

Minimal Distillation Schedule for Extreme Language Model Compression.
Chen Zhang, Yang Yang, Qifan Wang, Jiahao Liu, Jingang Wang, Wei Wu, and Dawei Song.
In EACL 2024 Findings. [arXiv][poster][code]

Sparse Teachers Can Be Dense with Knowledge.
Yi Yang#, Chen Zhang#, and Dawei Song.
In EMNLP 2022. [arXiv][poster][code]

Language Agents

Phoenix: Democratizing ChatGPT across Languages.
Zhihong Chen, Feng Jiang, Junying Chen, Tiannan Wang, Fei Yu, Guiming Chen, Hongbo Zhang, Juhao Liang, Chen Zhang, Zhiyi Zhang, Jianquan Li, Xiang Wan, Benyou Wang, and Haizhou Li.
Preprint. [arXiv][code]

Opinion Mining

PyABSA: A Modularized Framework for Reproducible Aspect-based Sentiment Analysis.
Heng Yang, Chen Zhang, and Ke Li.
In CIKM 2023 Demo. [arXiv][code]

Structural Bias For Aspect Sentiment Triplet Extraction.
Chen Zhang, Lei Ren, Fang Ma, Jingang Wang, Wei Wu, and Dawei Song.
In COLING 2022. [arXiv][slides][code][data][blog]

Aspect-specific Context Modeling for Aspect-based Sentiment Analysis.
Fang Ma, Chen Zhang, Bo Zhang, and Dawei Song.
In NLPCC 2022. [arXiv][slides][data]

Exploiting Position Bias for Robust Aspect Sentiment Classification.
Fang Ma#, Chen Zhang#, and Dawei Song.
In ACL 2021 Findings. [arXiv][slides][code]

End-to-end Emotion-Cause Pair Extraction via Learning to Link.
Haolin Song, Chen Zhang, Qiuchi Li, and Dawei Song.
Preprint. [arXiv][code]

A Multi-task Learning Framework for Opinion Triplet Extraction.
Chen Zhang, Qiuchi Li, Dawei Song, and Benyou Wang.
In EMNLP 2020 Findings. [arXiv][paper][code]

Aspect-based Sentiment Classification with Aspect-specific Graph Convolutional Networks.
Chen Zhang, Qiuchi Li, and Dawei Song.
In EMNLP 2019. [arXiv][slides][code]

Syntax-Aware Aspect-Level Sentiment Classification with Proximity-Weighted Convolution Network.
Chen Zhang, Qiuchi Li, and Dawei Song.
In SIGIR 2019. [arXiv][poster][code]

Model Generalization

Modular Retrieval for Generalization and Interpretation.
Juhao Liang, Chen Zhang, Zhengyang Tang, Jie Fu, Dawei Song, and Benyou Wang.
Preprint. [arXiv][code]

XPrompt: Exploring the Extreme of Prompt Tuning.
Fang Ma, Chen Zhang, Lei Ren, Jingang Wang, Qifan Wang, Wei Wu, Xiaojun Quan, and Dawei Song.
In EMNLP 2022. [arXiv][poster]

Making Pretrained Language Models Good Long-tailed Learners.
Chen Zhang, Lei Ren, Jingang Wang, Wei Wu, and Dawei Song.
In EMNLP 2022. [arXiv][poster][code]

Doge Tickets: Uncovering Domain-general Language Models by Playing Lottery Tickets.
Yi Yang#, Chen Zhang#, Benyou Wang, and Dawei Song.
In NLPCC 2022, Best Paper Award. [arXiv][slides][code]

Adaptable Text Matching via Meta-Weight Regulator.
Bo Zhang, Chen Zhang, Fang Ma, and Dawei Song.
In SIGIR 2022. [arXiv][paper][slides]

A Simple Baseline for Cross-domain Few-shot Text Classification.
Chen Zhang and Dawei Song.
In NLPCC 2021. [paper][slides][code]

Talks

On Long-context Efficiency at Li Auto. 2024/7/25. [slides]

On the Democratization of LLMs at ByteDance Research. 2024/8/8. [slides]

Services

Organizer: WSDM Cup 2024.

Reviewer: ARR, ACL, EMNLP, NAACL, SIGIR, CIKM, AAAI.

Secondary Reviewer: WSDM, ICTIR, TOIS.

Volunteer: EMNLP.

Honors & Awards

Outstanding Paper Award at ACL. 2025.

Best Paper Award at NLPCC. 2022.

Elite Ph.D. Student at BIT. 2021.

XIAOMI Scholarship. 2021.

Excellent Undergraduate & Graduation Thesis at BIT. 2019.

SIGIR Student Travel Grant. 2019.

Excellent Prize in the International Collegiate Competition for Brain-inspired Computing. 2018.

Quite a Few Medals from Chinese (CCGC) and International (TAAI, ICGA) Computer Games Competitions. 2017, 2018, 2019.

Third Prize at China University Robot Competition (ROBOCON), as a member of Robot Team DreamChaser at BIT. 2018.

First Prize at China Undergraduate Mathematical Contest in Modeling, Beijing Division. 2016.