Research
My research interests cover the alignment of agents with human interests, e.g., RLHF for LLMs, and the alignment between self-interested agents, e.g., the emergence of collaboration and agreement in multi-agent systems.
|
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset
Jiaming Ji*, Mickel Liu*, Juntao Dai*, Xuehai Pan, Chi Zhang, Ce Bian, Ruiyang Sun, Yizhou Wang, Yaodong Yang (*equal contribution, random ordering)
Preprint
/
Code
/
Hugging Face
NeurIPS 2023
We present the BeaverTails dataset for safety research in large language models. Our findings show that modeling decoupled human preferences for helpfulness and harmlessness improves LLM safety without sacrificing performance.
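The decoupled-preference idea can be illustrated with a short sketch: instead of a single preference model, one trains a reward model on helpfulness comparisons and a separate cost model on harmlessness comparisons, each with a Bradley-Terry style pairwise loss. This is only an illustrative sketch of the idea described above, not the paper's reference implementation; the model interfaces and batch field names are assumptions.

```python
import torch
import torch.nn.functional as F

def preference_loss(score_chosen: torch.Tensor, score_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry style pairwise loss: the preferred response should score higher."""
    return -F.logsigmoid(score_chosen - score_rejected).mean()

def decoupled_train_step(reward_model, cost_model, batch, optim_r, optim_c):
    # Helpfulness comparisons train the reward model (hypothetical scorer interface).
    r_loss = preference_loss(
        reward_model(batch["better_response"]),
        reward_model(batch["worse_response"]),
    )
    # Harmlessness comparisons train a separate cost model (higher cost = more harmful),
    # keeping the two human-preference signals decoupled.
    c_loss = preference_loss(
        cost_model(batch["more_harmful_response"]),
        cost_model(batch["less_harmful_response"]),
    )
    optim_r.zero_grad(); r_loss.backward(); optim_r.step()
    optim_c.zero_grad(); c_loss.backward(); optim_c.step()
    return r_loss.item(), c_loss.item()
```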
|
|
Safe RLHF: Safe Reinforcement Learning from Human Feedback
Josef Dai*, Xuehai Pan*, Ruiyang Sun*, Jiaming Ji*, Xinbo Xu, Mickel Liu, Yizhou Wang, Yaodong Yang (*equal contribution)
Preprint
/
Code
/
Website
ICLR 2024 Spotlight
Building on the BeaverTails dataset above, we introduce an RLHF algorithm with explicit safety constraints. Using the Lagrangian method, Safe RLHF dynamically balances harmlessness and helpfulness during fine-tuning. Across three rounds of fine-tuning, Safe RLHF mitigates harmful responses more effectively and achieves better performance than existing value-aligned algorithms.
This work was graciously promoted in a tweet by Ahsen Khaliq (@_akhaliq).
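The Lagrangian balancing can be sketched in a few lines: a learnable multiplier lambda scales the cost penalty in the policy objective and is updated by dual ascent whenever the expected cost exceeds a budget. This is a schematic of the constrained-RLHF idea only; the variable names and the budget value are assumptions, not the paper's implementation.

```python
import torch

# Dual variable, kept non-negative via softplus of an unconstrained parameter.
log_lambda = torch.zeros(1, requires_grad=True)
lambda_optimizer = torch.optim.Adam([log_lambda], lr=1e-2)
cost_budget = 0.0  # assumed harmlessness budget

def policy_objective(reward: torch.Tensor, cost: torch.Tensor) -> torch.Tensor:
    """Lagrangian objective: maximize reward minus lambda * cost (return the negative to minimize)."""
    lam = torch.nn.functional.softplus(log_lambda).detach()
    return -(reward - lam * cost).mean()

def update_lambda(cost: torch.Tensor) -> None:
    """Dual ascent: increase lambda when the expected cost exceeds the budget."""
    lam = torch.nn.functional.softplus(log_lambda)
    lambda_loss = -lam * (cost.mean().detach() - cost_budget)
    lambda_optimizer.zero_grad()
    lambda_loss.backward()
    lambda_optimizer.step()
```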
|
|
Baichuan 2: Open Large-scale Language Models
Aiyuan Yang, Bin Xiao, Bingning Wang, Borong Zhang, Chao Yin, Chenxu Lv, Da Pan, Dian Wang, Dong Yan, Fan Yang, Fei Deng, Feng Wang, Feng Liu, Guangwei Ai, Guosheng Dong, Haizhou Zhao, Hang Xu, Haoze Sun, Hongda Zhang, Hui Liu, Jiaming Ji, Jian Xie, Juntao Dai, Kun Fang, Lei Su, Liang Song, Lifeng Liu, Liyun Ru, Luyao Ma, Mang Wang, Mickel Liu, MingAn Lin, Nuolan Nie, Peidong Guo, Ruiyang Sun, Tao Zhang, Tianpeng Li, Tianyu Li, Wei Cheng, Weipeng Chen, Xiangrong Zeng, Xiaochuan Wang, Xiaoxi Chen, Xin Men, Xin Yu, Xuehai Pan, Yanjun Shen, Yiding Wang, Yiyu Li, Youxin Jiang, Yuchen Gao, Yupeng Zhang, Zenan Zhou, Zhiying Wu (alphabetical ordering)
Preprint
/
Code
/
Hugging Face
/
Bloomberg ($1bn valuation)
Publicly released technical report
During my time at Baichuan, I participated in the open-sourcing of our LLMs, which were trained from scratch on 2.6 trillion tokens, and I am credited as an author on the corresponding technical report.
Baichuan 2 matches or exceeds the performance of other open-source models in its class across various public benchmarks, including MMLU, CMMLU, GSM8K, and HumanEval. Notably, Baichuan2-13B-Chat achieved the highest score on the SuperCLUE-agent benchmark among all open-sourced models.
|
|
Proactive Multi-Camera Collaboration For 3D Human Pose Estimation
Hai Ci*, Mickel Liu*, Xuehai Pan*, Fangwei Zhong, Yizhou Wang (*equal contribution)
Proceedings
/
Code
/
Website
ICLR 2023
Active3DPose presents a multi-agent reinforcement learning (MARL) scheme for proactive multi-camera collaboration in 3D human pose estimation amid dynamic human crowds. The aerial cameras are decentralized, self-interested agents, yet they must collaborate to complete the given task. We propose a reward structure inspired by the Shapley value solution concept that helps facilitate collaboration. The simulation environment is built with Unreal Engine, and we train our agents with the distributed RL framework Ray RLlib.
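The Shapley-inspired credit assignment can be sketched as each camera's average marginal contribution to the team's pose-estimation quality over coalitions of cameras. The sketch below is illustrative only: `team_quality` is a hypothetical stand-in for the actual estimation metric, and the reward used in the paper differs in its details.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, Sequence

def shapley_rewards(
    cameras: Sequence[int],
    team_quality: Callable[[FrozenSet[int]], float],  # hypothetical coalition-quality metric
) -> Dict[int, float]:
    """Shapley value: average marginal contribution of each camera over all coalitions."""
    n = len(cameras)
    rewards = {c: 0.0 for c in cameras}
    for c in cameras:
        others = [x for x in cameras if x != c]
        for k in range(n):
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                rewards[c] += weight * (team_quality(s | {c}) - team_quality(s))
    return rewards
```

Since the exact sum is exponential in the number of cameras, in practice one would approximate it with sampled coalitions.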
|
|
MATE: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control
Xuehai Pan, Mickel Liu, Fangwei Zhong, Yaodong Yang, Song-Chun Zhu, Yizhou Wang
Proceedings
/
Code
/
Doc
NeurIPS 2022
We introduce the Multi-Agent Tracking Environment (MATE), a novel multi-agent environment that simulates real-world target coverage control problems. MATE hosts an asymmetric cooperative-competitive game between two groups of learning agents with opposing interests: cameras and targets. Co-evolution between the cameras and targets helps realize a less exploitable camera network.
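The opposing-interest structure can be illustrated with a toy reward assignment: cameras share a cooperative team signal tied to how many targets are covered, while each target is rewarded for staying uncovered. This is only an illustrative sketch of the asymmetric game, not MATE's actual reward function.

```python
from typing import Dict, Sequence, Set

def asymmetric_rewards(
    covered_targets: Set[int],       # targets currently inside some camera's field of view
    camera_ids: Sequence[int],
    target_ids: Sequence[int],
) -> Dict[str, Dict[int, float]]:
    """Toy opposing-interest rewards: cooperative cameras vs. competitive targets."""
    coverage_rate = len(covered_targets) / max(len(target_ids), 1)
    camera_rewards = {c: coverage_rate for c in camera_ids}  # shared team signal
    target_rewards = {t: -1.0 if t in covered_targets else 1.0 for t in target_ids}
    return {"cameras": camera_rewards, "targets": target_rewards}
```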
|
|
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research
Jiaming Ji*, Jiayi Zhou*, Borong Zhang*, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang (*core developers)
Preprint
/
Code
/
Website
JMLR 2024
We introduce OmniSafe, an infrastructure framework designed to accelerate safe reinforcement learning (SafeRL) research. The framework provides a collection of SafeRL algorithms spanning different RL domains and places a heavy emphasis on safety constraints.
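A minimal usage sketch is shown below, assuming OmniSafe exposes an `Agent(algorithm, environment)` entry point as in its quick-start; the algorithm and environment names here are assumptions and may differ between releases, so consult the documentation for the current API.

```python
import omnisafe

# Algorithm and environment IDs are assumptions; check the OmniSafe docs for current names.
env_id = "SafetyPointGoal1-v0"
agent = omnisafe.Agent("PPOLag", env_id)
agent.learn()
```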
|
Miscellanea
Recipient of Paul G. Allen School CSE Fellowship
Conference Reviewer: ICML 2023, NeurIPS 2022-24, ICLR 2023-25, AAAI 2024
Recipient of CSC Fellowship from 2020 to 2023