Introduction to AdaBoost and Its Python Implementation


AdaBoost

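AdaBoost combines multiple weak classifiers by weighted majority voting: weak classifiers with a low classification error rate receive a larger weight, and weak classifiers with a high classification error rate receive a smaller weight.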
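To make the weighting rule concrete, here is a minimal sketch of a weighted majority vote. It assumes labels in {-1, +1} and the coefficient formula 0.5 * ln((1 - e) / e) that the implementation in this post also uses; the function names classifier_weight and weighted_vote and the toy predictions are illustrative assumptions, not part of the original post.

```python
import numpy as np

def classifier_weight(error):
    """Voting weight of a weak classifier with weighted error rate `error` (0 < error < 1)."""
    return 0.5 * np.log((1 - error) / error)

def weighted_vote(predictions, alphas):
    """Combine +1/-1 predictions of several weak classifiers by weighted majority vote.

    predictions: array of shape (n_classifiers, n_samples) with entries in {-1, +1}
    alphas: sequence of n_classifiers voting weights
    """
    scores = np.asarray(alphas) @ np.asarray(predictions)
    return np.where(scores > 0, 1, -1)

# A lower error rate yields a larger voting weight.
alphas = [classifier_weight(e) for e in (0.1, 0.3, 0.45)]
print(alphas)

# Three weak classifiers vote on two samples (toy values).
# The most accurate classifier outvotes the two weaker ones.
predictions = np.array([[1, -1],
                        [-1, 1],
                        [-1, 1]])
print(weighted_vote(predictions, alphas))
```

In this toy case the first classifier's weight exceeds the sum of the other two, so its prediction decides the vote even though it is outnumbered.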

General Workflow of AdaBoost

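(1) Collect data: any method can be used.
(2) Prepare data: depends on the type of weak classifier used.
(3) Analyze data: any method can be used.
(4) Train the algorithm: most of AdaBoost's time is spent on training; weak classifiers are trained repeatedly on the same dataset.
(5) Test the algorithm: compute the classification error rate.
(6) Use the algorithm: like an SVM, AdaBoost predicts one of two classes.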

Algorithm Flow
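For reference, the quantities computed by the Python implementation in this post can be written as follows, for rounds m = 1, ..., M over N samples with labels y_i in {-1, +1}, weak classifier G_m, and weights w_{1,i} = 1/N. This is the standard formulation; the notation is chosen here to mirror the variable names in the code (e_m, alpha_m, G_m, W) and is otherwise an assumption.

```latex
\begin{aligned}
e_m &= \sum_{i=1}^{N} w_{m,i}\, I\big(G_m(x_i) \neq y_i\big) \\
\alpha_m &= \frac{1}{2} \ln \frac{1 - e_m}{e_m} \\
w_{m+1,i} &= \frac{w_{m,i} \exp\big(-\alpha_m y_i G_m(x_i)\big)}{\sum_{j=1}^{N} w_{m,j} \exp\big(-\alpha_m y_j G_m(x_j)\big)} \\
G(x) &= \operatorname{sign}\Big(\sum_{m=1}^{M} \alpha_m G_m(x)\Big)
\end{aligned}
```

Misclassified samples have y_i G_m(x_i) = -1, so their weights grow by a factor of exp(alpha_m) before normalization, while correctly classified samples shrink by exp(-alpha_m).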

Training samples (from Li Hang, Statistical Learning Methods)

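The weak classifiers used are of the form x < v or x > v: each predicts +1 on the indicated side of the threshold v and -1 otherwise.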

Index	1	2	3	4	5	6	7	8	9	10
x	0	1	2	3	4	5	6	7	8	9
y	1	1	1	-1	-1	-1	1	1	1	-1
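As a worked example of the first boosting round on this data, the sketch below searches all thresholds v = 0.5, 1.5, ..., 9.5 and both rules under uniform weights of 0.1 per sample. The variable names and the tie-breaking tolerance are assumptions made for illustration.

```python
import numpy as np

x = np.arange(10)
y = np.array([1, 1, 1, -1, -1, -1, 1, 1, 1, -1])
w = np.full(10, 0.1)  # uniform initial weights, 1/N each

best = None
for v in np.arange(0.5, 10, 1):          # candidate thresholds 0.5, 1.5, ..., 9.5
    for rule in ("x < v", "x > v"):      # which side of the threshold is labelled +1
        positive = x < v if rule == "x < v" else x > v
        pred = np.where(positive, 1, -1)
        err = np.sum(w[pred != y])       # weighted error rate of this stump
        # Require a strict improvement (with a small tolerance),
        # so ties keep the first threshold found.
        if best is None or err < best[0] - 1e-12:
            best = (err, v, rule)

err, v, rule = best
alpha = 0.5 * np.log((1 - err) / err)
print(f"best stump: +1 if {rule}, v = {v}, error = {err:.4f}, alpha = {alpha:.4f}")
```

On this data the search selects the rule x < v with v = 2.5, which misclassifies three samples (x = 6, 7, 8) for an error rate of 0.3 and a coefficient of about 0.4236. The threshold v = 8.5 ties at the same error rate; the first one found is kept.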

Python Implementation

import numpy as np


class Classifier:
    """Weak classifier: a threshold rule (decision stump) on a single feature."""

    def __init__(self):
        self.v = 0
        # Two variants: left == 0 predicts +1 when x < v,
        # left == 1 predicts +1 when x > v.
        self.left = 0

    def train(self, x, y, w):
        """Pick the threshold and variant with the lowest weighted error; return that error."""
        min_loss = 1
        for v in np.arange(0.5, 10, 1):
            for left in [0, 1]:
                if left == 0:
                    # Map True/False to +1/-1: (1 - 0.5) * 2 = 1, (0 - 0.5) * 2 = -1
                    label = (((x < v) - 0.5) * 2 != y)
                else:
                    label = (((x > v) - 0.5) * 2 != y)
                loss = np.sum(label * w)  # weighted error rate
                if loss < min_loss:
                    min_loss = loss
                    self.v = v
                    self.left = left
        return min_loss

    def test(self, x):
        if self.left == 0:
            return ((x < self.v) - 0.5) * 2
        else:
            return ((x > self.v) - 0.5) * 2


class AdaBoost:
    def __init__(self, classifier=Classifier):
        self.classifier = classifier
        self.pool = []    # trained weak classifiers
        self.alphas = []  # coefficient of each weak classifier

    def train(self, x, y):
        num = x.shape[0]
        M = 3  # number of boosting rounds
        W = np.array([1 / num] * num)  # uniform initial weights
        print('  Weight:', W)
        for m in range(M):
            classifier_m = self.classifier()
            e_m = classifier_m.train(x, y, W)  # weighted error rate
            print('GEN:', m, 'Error:', e_m, '\n')
            alpha_m = 1 / 2 * np.log((1 - e_m) / e_m)  # coefficient of classifier m
            G_m = classifier_m.test(x)
            W = W * np.exp(-alpha_m * y * G_m)  # update the sample weights
            W = W / np.sum(W)  # normalize
            print('  Weight:', W)
            self.pool.append(classifier_m)
            self.alphas.append(alpha_m)  # store the coefficient

    def test(self, x):
        n = x.shape[0]
        results = np.zeros(n)
        for alpha, classifier in zip(self.alphas, self.pool):
            results += alpha * classifier.test(x)
        return ((results > 0) - 0.5) * 2  # sign of the weighted vote, as +1/-1


x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
x = np.array(x)
y = np.array(y)
ab = AdaBoost()
ab.train(x, y)
print('TEST: ', ab.test(x))
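One edge case worth noting: the coefficient 0.5 * ln((1 - e_m) / e_m) is undefined when a weak classifier has zero weighted error (e_m = 0). The implementation above does not hit this case on the sample data, but on other data a guard may be needed. Below is one possible sketch that clips the error rate away from 0 and 1; the helper name safe_alpha and the tolerance eps are assumptions for illustration, not part of the original implementation.

```python
import numpy as np

def safe_alpha(error, eps=1e-10):
    """Classifier coefficient with the error rate clipped to [eps, 1 - eps]."""
    error = np.clip(error, eps, 1 - eps)
    return 0.5 * np.log((1 - error) / error)

print(safe_alpha(0.3))  # ordinary case: same value as the unclipped formula
print(safe_alpha(0.0))  # large but finite, instead of a division by zero
```

Clipping is only one option; stopping the boosting loop early when a weak classifier is already perfect would be an equally reasonable choice.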

Results

Reference

https://blog.csdn.net/u013859301/article/details/79483126
