Publications
For an up-to-date list of publications, see my Google Scholar page. A categorized list follows.
Alignment
SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection
Han Shen, Pin-Yu Chen, Payel Das, Tianyi Chen
ICLR 2025. [arxiv]
Fundamental Safety-Capability Trade-offs in Fine-tuning Large Language Models
Pin-Yu Chen*, Han Shen*, Payel Das, Tianyi Chen
*equal contribution, preprint. [arxiv]
Mitigating Forgetting in LLM Supervised Fine-Tuning and Preference Learning
Heshan Fernando*, Han Shen*, Parikshit Ram, Yi Zhou, Horst Samulowitz, Nathalie Baracaldo, Tianyi Chen
*equal contribution, preprint. [arxiv]
Optimization
On Penalty-based Bilevel Gradient Descent Method
Han Shen, Quan Xiao, Tianyi Chen
ICML 2023, extended work in Mathematical Programming. [arxiv]
Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach
Heshan D. Fernando, Han Shen, Miao Liu, Subhajit Chaudhury, Keerthiram Murugesan, Tianyi Chen
ICLR 2023 (oral). [arxiv]
Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization
A.F.M. Saif, Xiaodong Cui, Han Shen, Songtao Lu, Brian Kingsbury, Tianyi Chen
ICASSP 2024.
A Method For Bilevel Optimization With Convex Lower-level Problem
Han Shen, Santiago Paternain, Gaowen Liu, Ramana Kompella, Tianyi Chen
ICASSP 2024.
Alternating projected SGD for equality-constrained bilevel optimization
Quan Xiao, Han Shen, Wotao Yin, Tianyi Chen
AISTATS 2023. [arxiv]
A Single-timescale Analysis for Stochastic Approximation with Multiple Coupled Sequences
Han Shen, Tianyi Chen
NeurIPS 2022 (oral). [arxiv]
Reinforcement Learning
On Entropy Control in LLM-RL Algorithms
Han Shen
ICLR 2026. [arxiv]
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
Han Shen, Zhuoran Yang, Tianyi Chen
ICML 2024, extended work in JMLR. [arxiv]
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup
Han Shen, Kaiqing Zhang, Mingyi Hong, Tianyi Chen
IEEE Transactions on Signal Processing. [arxiv]
Adaptive Temporal Difference Learning with Linear Function Approximation
Tao Sun, Han Shen, Tianyi Chen, Dongsheng Li
TPAMI. [arxiv]
Byzantine-resilient Decentralized Policy Evaluation with Linear Function Approximation
Zhaoxian Wu, Han Shen, Tianyi Chen, Qing Ling
IEEE Transactions on Signal Processing. [arxiv]
Distributed Offline Policy Optimization Over Batch Data
Han Shen, Songtao Lu, Xiaodong Cui, Tianyi Chen
AISTATS 2022. [paper]
