Institute of Information Science, Academia Sinica

Seminar

TIGP (SNHCC) – Bandit Learning: Optimality, Scalability, and Reneging

  • Lecturer: Dr. Ping-Chun Hsieh (Department of Computer Science, National Chiao Tung University)
    Host: TIGP SNHCC Program
  • Time: 2019-12-25 (Wed.) 14:00 ~ 16:00
  • Location: Auditorium 106 at IIS New Building
Abstract

Bandit learning is a classic framework that captures the exploration-exploitation dilemma. Despite the variety of existing bandit algorithms, the trade-off between regret performance and computational complexity remains unsatisfactory. In this talk, we will present a new family of bandit algorithms formulated in a general way based on the biased maximum likelihood estimation (BMLE) method. We prove that the BMLE algorithm achieves a logarithmic finite-time regret bound and hence attains order-optimality. Through extensive simulations, we demonstrate that the proposed algorithm achieves regret performance comparable to the best of several state-of-the-art baseline methods while having a significant computational advantage over the other best-performing methods. Lastly, we will discuss how bandit learning can be extended to capture reneging risk and heteroscedasticity.