Posts in category ML

BinaryGAN Source Code Analysis

January 11, 2021

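https://github.com/salu133445/binarygan Dataset: the dataset is generated with SharedArray by running python3 ./training_data/load_mnist_to_sa.py ./training_data/mnist/ --merge --binary which in fact stores the data under /dev/shm; the default file is /dev/shm/_binarized_mnist_x. Config: all development-mode configuration options live in config.py. Upgrading from tf1.0 to 2.0: change import tensorflow as tf……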

Read more

PRML Notes 1

October 15, 2014

| ML

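Concepts: training set; test set; validation set; target vector, which is made up of target variables; generalization; preprocessing, also called feature extraction. Taxonomy: depending on whether target vectors are available, learning is divided into supervised learning (……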

Read more

Multi-Armed Bandits 3

October 11, 2014

| ML

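The UCB-family algorithms discussed in the previous two parts assume a stationary setting, in which the distribution parameters of each arm never change. That is why we could be "optimistic in the face of uncertainty" and use the sample means to identify the best arm as quickly as possible. In the real world, however, the reward structure is more complex and non-stationary, especially in competitive settings such as stock trading. We call this ……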

Read more

Multi-Armed Bandits 2

October 10, 2014

| ML

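This part looks at how the multi-armed bandit problem was posed and at its theoretical foundations, and closes with a discussion of the UCB family of policies. The bandit problem considered here is, of course, the stochastic one. The problem statement of the stochastic multi-armed bandit is not repeated here; see the previous part. Proofs of the theoretical results: in 1985, Lai & Robbins showed that for certain distributions (those with a single real parameter) there exist policies such that ……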
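As a minimal sketch of one member of the UCB family mentioned above, the following implements the UCB1 index of Auer, Cesa-Bianchi and Fischer (2002) on simulated Bernoulli arms. The function name ucb1, the arm means, the horizon and the seed are illustrative assumptions and do not come from the post.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on simulated Bernoulli arms (hypothetical helper).

    Returns the number of pulls per arm and the total reward collected.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # how often each arm has been pulled
    sums = [0.0] * n_arms   # cumulative reward of each arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialisation: pull every arm once so each index is defined.
            arm = t - 1
        else:
            # Index = empirical mean + exploration bonus sqrt(2 ln t / n_i).
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        # Simulated Bernoulli reward; a real environment would supply this.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return counts, total


if __name__ == "__main__":
    pulls, reward = ucb1([0.2, 0.5, 0.8], horizon=5000)
    print("pulls per arm:", pulls, "total reward:", reward)
```

The bonus term shrinks as an arm is pulled more often, so arms that look worse are still revisited occasionally, but less and less frequently.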

Read more

Multi-Armed Bandits 1

October 4, 2014

| ML

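Imagine a venture capitalist who wants to maximize returns. He constantly faces a dilemma: when to invest in companies that have already succeeded, and when to invest in companies that have not yet succeeded but show great potential. As the stock-market saying goes, returns always come with risk. A successful venture capitalist must handle this exploration-exploitation tradeoff well: too much exploration means failing to obtain higher ……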
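The tradeoff can be made concrete with a small simulation. The sketch below uses epsilon-greedy, a simple policy that the excerpt does not name and that is chosen here only for illustration: with probability epsilon it explores a random arm, otherwise it exploits the arm with the best estimated mean. The function name, the arm means and the seed are assumptions.

```python
import random


def epsilon_greedy(arm_means, horizon, epsilon, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms (hypothetical helper).

    Returns the number of pulls per arm and the total reward collected.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms       # pulls per arm
    estimates = [0.0] * n_arms  # running mean reward per arm
    total = 0.0
    for _ in range(horizon):
        if rng.random() < epsilon:
            # Explore: try an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the best-looking arm (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda i: estimates[i])
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return counts, total


if __name__ == "__main__":
    for eps in (0.0, 0.1, 1.0):
        pulls, reward = epsilon_greedy([0.3, 0.7], horizon=5000, epsilon=eps)
        print(f"epsilon={eps}: pulls={pulls}, total reward={reward}")
```

With epsilon set to 0 the policy never leaves the first arm it tries, and with epsilon set to 1 it never uses what it has learned; intermediate values trade one failure mode against the other.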

Read more

pymaBandit Notes

September 2, 2014

| ML

Distributions: Bernoulli distribution, Poisson distribution, Exponential distribution

Policies:
Gittins' Bayesian optimal strategy for binary rewards [1]
The classical UCB policy [2]
The UCB-V policy [3]
The KL-UCB policy [4]
The Clopper-Pearson policy for binary rewards [4]
The MOSS policy [5]
The DMED policy [6]
The Empirical Likelihood UCB [7]
The Bayes-UCB policy [8]
The Thompson sampling policy [9]

[1] Bandit Processes and Dynamic Allocation Indices. J. C. Gittins. Journal of the Royal Statistical Society, Series B (Methodological), Vol. 41, No. 2, 1979, pp. 148–177
[2] Finite-time analysis of the multiarmed bandit problem. Peter Auer, Nicolò Cesa-Bianchi and Paul Fischer. Machine Learning 47, 2002, pp. 235–256
[3] Exploration-exploitation trade-off using variance estimates in multi-armed bandits. J.-Y. Audibert, R. Munos, Cs. Szepesvár……
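To give a feel for one of the listed policies, here is a minimal sketch of Thompson sampling for binary (Bernoulli) rewards with a uniform Beta(1, 1) prior on each arm. It is written from the textbook description and is not pymaBandit's implementation; the function name, the arm means and the seed are assumptions.

```python
import random


def thompson_bernoulli(arm_means, horizon, seed=0):
    """Thompson sampling on simulated Bernoulli arms (hypothetical helper).

    Returns per-arm success and failure counts.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(horizon):
        # Draw one sample from each arm's Beta posterior.
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=lambda i: samples[i])
        # Simulated binary reward; a real environment would supply this.
        if rng.random() < arm_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    wins, losses = thompson_bernoulli([0.2, 0.8], horizon=3000)
    print("pulls per arm:", [w + l for w, l in zip(wins, losses)])
```

Because each arm is chosen with the posterior probability that it is the best one, exploration fades on its own as the posteriors concentrate.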

Read more
