Publications

For an up-to-date list of publications, see my Google Scholar page. A categorized list follows.

Alignment

  • SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection
    Han Shen, Pin-Yu Chen, Payel Das, Tianyi Chen
    ICLR 2025. [arxiv]

  • Fundamental Safety-Capability Trade-offs in Fine-tuning Large Language Models
    Pin-Yu Chen*, Han Shen*, Payel Das, Tianyi Chen
    *equal contribution, preprint. [arxiv]

  • Mitigating Forgetting in LLM Supervised Fine-Tuning and Preference Learning
    Heshan Fernando*, Han Shen*, Parikshit Ram, Yi Zhou, Horst Samulowitz, Nathalie Baracaldo, Tianyi Chen
    *equal contribution, preprint. [arxiv]

Optimization

  • On Penalty-based Bilevel Gradient Descent Method
    Han Shen, Quan Xiao, Tianyi Chen
    ICML 2023, extended work in Mathematical Programming. [arxiv]

  • Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach
    Heshan D. Fernando, Han Shen, Miao Liu, Subhajit Chaudhury, Keerthiram Murugesan, Tianyi Chen
    ICLR 2023 (oral). [arxiv]

  • Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization
    A.F.M. Saif, Xiaodong Cui, Han Shen, Songtao Lu, Brian Kingsbury, Tianyi Chen
    ICASSP 2024.

  • A Method For Bilevel Optimization With Convex Lower-level Problem
    Han Shen, Santiago Paternain, Gaowen Liu, Ramana Kompella, Tianyi Chen
    ICASSP 2024.

  • Alternating Projected SGD for Equality-constrained Bilevel Optimization
    Quan Xiao, Han Shen, Wotao Yin, Tianyi Chen
    AISTATS 2023. [arxiv]

  • A Single-timescale Analysis for Stochastic Approximation with Multiple Coupled Sequences
    Han Shen, Tianyi Chen
    NeurIPS 2022 (oral). [arxiv]

Reinforcement Learning

  • On Entropy Control in LLM-RL Algorithms
    Han Shen
    ICLR 2026. [arxiv]

  • Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
    Han Shen, Zhuoran Yang, Tianyi Chen
    ICML 2024, extended work in JMLR. [arxiv]

  • Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup
    Han Shen, Kaiqing Zhang, Mingyi Hong, Tianyi Chen
    IEEE Transactions on Signal Processing. [arxiv]

  • Adaptive Temporal Difference Learning with Linear Function Approximation
    Tao Sun, Han Shen, Tianyi Chen, Dongsheng Li
    TPAMI. [arxiv]

  • Byzantine-resilient Decentralized Policy Evaluation with Linear Function Approximation
    Zhaoxian Wu, Han Shen, Tianyi Chen, Qing Ling
    IEEE Transactions on Signal Processing. [arxiv]

  • Distributed Offline Policy Optimization Over Batch Data
    Han Shen, Songtao Lu, Xiaodong Cui, Tianyi Chen
    AISTATS 2022. [paper]