Hi there! I'm Zhe Wang, a first-year M.S. student at the University of Illinois Urbana-Champaign, majoring in Computer Science. I am advised by Prof. Lingming Zhang. Before coming to UIUC, I completed my undergraduate studies at Tsinghua University, majoring in Mathematics and Physics.
My research interests lie at the intersection of Artificial Intelligence and Software Engineering.
I am currently seeking a Ph.D. position starting in Fall 2026. Feel free to drop me an email if you are interested in my research or have any questions. Please find my detailed Curriculum Vitae below:
Jiawei Liu*, Nirav Diwan*, Zhe Wang*, Haoyu Zhai, Xiaona Zhou, Kiet A. Nguyen, Tianjiao Yu, Muntasir Wahed, Yinlin Deng, Hadjer Benkraouda, Yuxiang Wei, Lingming Zhang, Ismini Lourentzou, Gang Wang (* equal contribution)
🥇 1st Place in Amazon Nova AI Challenge 2025 ($250,000)
We introduce PurpCode, the first post-training recipe for training safe code reasoning models towards generating secure code and defending against malicious cyberactivities. PurpCode trains a reasoning model in two stages: (i) Rule Learning, which explicitly teaches the model to reference cybersafety rules to generate vulnerability-free code and to avoid facilitating malicious cyberactivities; and (ii) Reinforcement Learning, which optimizes model safety and preserves model utility through diverse, multi-objective reward mechanisms.
Kunlun Zhu†, Hongyi Du†, Zhaochen Hong†, Xiaocheng Yang†, Shuyi Guo†, Zhe Wang†, Zhenhailong Wang, Cheng Qian, Xiangru Tang, Heng Ji, Jiaxuan You (†core contributors)
ACL 2025 Main
In this paper, we introduce MultiAgentBench, a comprehensive benchmark designed to evaluate LLM-based multi-agent systems across diverse, interactive scenarios. Our framework measures not only task completion but also the quality of collaboration and competition using novel, milestone-based key performance indicators.
Yuxiang Wei, Zhe Wang, Jiawei Liu, Yifeng Ding, Lingming Zhang
ICML 2024
In this paper, we introduce Magicoder, a series of fully open-source (code, weights, and data) Large Language Models (LLMs) for code that significantly closes the gap with top code models while having no more than 7B parameters.