About me

I am a senior research engineer at Ant Group, where I currently work on a variety of topics in LLM alignment and RL.

Before this, I worked at IBM Research AI as a research intern, collaborating with Pin-Yu Chen, Payel Das, Songtao Lu, Xiaodong Cui and many other talented researchers. My research at IBM focused on LLM alignment and offline RL. During the same period, I did my Ph.D. at RPI under the supervision of Dr. Tianyi Chen (now at Cornell Tech). I was fortunate to join Dr. Tianyi Chen’s group as its first Ph.D. student. My Ph.D. research focused on optimization and reinforcement learning.

News and highlights

Selected works

  • SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection
    Han Shen, Pin-Yu Chen, Payel Das, Tianyi Chen
    ICLR 2025. [arxiv]

  • Mitigating Forgetting in LLM Supervised Fine-Tuning and Preference Learning
    Heshan Fernando*, Han Shen*, Parikshit Ram, Yi Zhou, Horst Samulowitz, Nathalie Baracaldo, Tianyi Chen
    *equal contribution, preprint. [arxiv]

  • Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
    Han Shen, Zhuoran Yang, Tianyi Chen
    ICML 2024, extended work in JMLR. [arxiv]

  • Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach
    Heshan D. Fernando, Han Shen, Miao Liu, Subhajit Chaudhury, Keerthiram Murugesan, Tianyi Chen
    ICLR 2023 oral. [arxiv]

  • On Penalty-based Bilevel Gradient Descent Method
    Han Shen, Quan Xiao, Tianyi Chen
    ICML 2023, extended work in Mathematical Programming. [MAPR]

Industry experience

Ant Group. (CN) Present

  • Senior research engineer, joined via Ant Star talent program.

IBM Research AI. (US) 05.2024 - 08.2024

IBM Research AI. (US) 05.2021 - 08.2021

Services

Reviewer for NeurIPS, ICML, ICLR, AISTATS, AAAI and IEEE Transactions on Signal Processing (TSP).