Our paper got accepted by NIPS'16
Source: Huasen Wu / University of California, Davis
2016-08-13

Our paper "Double Thompson Sampling for Dueling Bandits" has been accepted to NIPS'16, one of the top conferences in machine learning.

In this paper, we propose a Double Thompson Sampling (D-TS) algorithm for dueling bandit problems. As indicated by its name, D-TS selects both the first and the second candidates according to Thompson Sampling. Specifically, D-TS maintains a posterior distribution for the preference matrix, and chooses the pair of arms for comparison by sampling twice from the posterior distribution. This simple algorithm applies to general Copeland dueling bandits, including Condorcet dueling bandits as its special case. For general Copeland dueling bandits, we show that D-TS achieves O(K^2 log T) regret. For Condorcet dueling bandits, we further simplify the D-TS algorithm and show that the simplified D-TS algorithm achieves O(K log T + K^2 log log T) regret. Simulation results based on both synthetic and real-world data demonstrate the efficiency of the proposed D-TS algorithm.
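
To make the "sample twice" idea concrete, here is a rough Python sketch. It is not the algorithm from the paper: it assumes independent Beta(1, 1) priors on each pairwise preference, picks the first candidate by its Copeland score under one posterior sample, and picks the second by a fresh sample of the comparisons against the first. Any further steps in the full D-TS algorithm are left out; see the arXiv version for the actual procedure. The names `select_pair`, `run`, `wins`, and `pref` are made up for this illustration.

```python
import random


def select_pair(wins, rng):
    """Pick two arms by sampling twice from Beta posteriors (illustrative only).

    wins[i][j] counts how often arm i has beaten arm j so far.
    """
    k = len(wins)
    # First sample: draw a full preference matrix, theta[i][j] ~ P(i beats j).
    theta = [[0.5] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            theta[i][j] = rng.betavariate(wins[i][j] + 1, wins[j][i] + 1)
            theta[j][i] = 1.0 - theta[i][j]
    # Copeland score under this sample: how many other arms does arm i beat?
    scores = [sum(theta[i][j] > 0.5 for j in range(k) if j != i) for i in range(k)]
    best = max(scores)
    first = rng.choice([i for i in range(k) if scores[i] == best])
    # Second sample: redraw only the comparisons against the first candidate.
    vs_first = [
        0.5 if i == first
        else rng.betavariate(wins[i][first] + 1, wins[first][i] + 1)
        for i in range(k)
    ]
    top = max(vs_first)
    second = rng.choice([i for i in range(k) if vs_first[i] == top])
    return first, second


def run(pref, horizon, seed=0):
    """Simulate `horizon` rounds; pref[i][j] is the true P(i beats j)."""
    rng = random.Random(seed)
    k = len(pref)
    wins = [[0] * k for _ in range(k)]
    for _ in range(horizon):
        a, b = select_pair(wins, rng)
        if a == b:
            continue  # an arm compared with itself yields no new information
        if rng.random() < pref[a][b]:
            wins[a][b] += 1
        else:
            wins[b][a] += 1
    return wins
```

For example, `run([[0.5, 0.7, 0.8], [0.3, 0.5, 0.6], [0.2, 0.4, 0.5]], 200)` simulates 200 rounds on a three-arm problem in which arm 0 is the Condorcet winner, and returns the matrix of observed wins.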


A preliminary version can be found at https://arxiv.org/abs/1604.07101.

