How to Compute a Bootstrap Confidence Interval

2020-11-27 15:01:19 Editor: 小采

Bootstrap confidence intervals:

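Suppose the population distribution F is unknown, but we have a data sample of size n drawn from F. Drawing a new sample of size n from this sample by sampling with replacement gives what is called a bootstrap sample. Many bootstrap samples are drawn successively and independently from the original sample, and these samples are used to make statistical inferences about the population F. This approach is called the nonparametric bootstrap method, or simply the bootstrap.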
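As a minimal sketch of the resampling step described above, the snippet below draws a single bootstrap sample. The data values, the seed, and the variable names are illustrative assumptions, not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

original = np.array([3, 7, 7, 12, 15, 21])  # illustrative data (assumption)
n = len(original)

# Draw n indices uniformly from 0..n-1 with replacement, then index the sample.
indices = rng.integers(0, n, size=n)
bootstrap_sample = original[indices]

print(bootstrap_sample)
```

Because sampling is with replacement, some original values typically appear more than once in a bootstrap sample and others not at all.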

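The bootstrap method can be used to obtain a confidence interval for a quantity of interest (a parameter); such an interval is called a bootstrap confidence interval.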
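The article does not write out a formula at this point. One common construction, and the one its Python code follows, is the percentile interval. The notation below (the estimator symbol and the quantile subscripts) is an assumption introduced for illustration.

```latex
% B bootstrap estimates \hat\theta^{*}_{1}, \dots, \hat\theta^{*}_{B};
% \hat\theta^{*}_{(p)} denotes their empirical p-quantile.
\[
\alpha = 1 - c, \qquad
\mathrm{CI}_{c} = \left[\, \hat\theta^{*}_{(\alpha/2)},\ \hat\theta^{*}_{(1-\alpha/2)} \,\right]
\]
% Worked example: B = 1000, c = 0.95 gives \alpha/2 = 0.025, so the limits sit
% near positions B\alpha/2 = 25 and B(1-\alpha/2) = 975 in the sorted estimates.
```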


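Computing a bootstrap confidence interval in Python:

Here we use one-dimensional data as an example and take the sample mean as the estimator. The code is as follows: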

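import numpy as np


def average(data):
    """Sample mean, used here as the estimator."""
    return float(np.mean(data))


def bootstrap(data, B, c, func):
    """
    Compute a bootstrap confidence interval.
    :param data: array holding the sample data
    :param B: number of resamples, usually B >= 1000
    :param c: confidence level, e.g. 0.95
    :param func: estimator applied to each bootstrap sample
    :return: lower and upper limits of the bootstrap confidence interval
    """
    array = np.array(data)
    n = len(array)
    sample_result_arr = []
    for i in range(B):
        # Draw n indices with replacement to form one bootstrap sample
        index_arr = np.random.randint(0, n, size=n)
        data_sample = array[index_arr]
        sample_result = func(data_sample)
        sample_result_arr.append(sample_result)

    # Read off the a/2 and 1 - a/2 positions in the sorted estimates
    a = 1 - c
    k1 = int(B * a / 2)
    k2 = min(int(B * (1 - a / 2)), B - 1)  # clamp so that c = 1 stays in range
    sample_result_arr_sorted = sorted(sample_result_arr)
    lower = sample_result_arr_sorted[k1]
    higher = sample_result_arr_sorted[k2]

    return lower, higher


if __name__ == '__main__':
    result = bootstrap(np.random.randint(0, 50, 50), 1000, 0.95, average)
    print(result)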

Output (the exact values vary from run to run, since both the data and the resampling are random):

(20.48, 28.32)
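For reproducible results, the same idea can be sketched with a seeded generator and np.percentile in place of the explicit loop and sort. The function name bootstrap_percentile_ci, its parameters, and the seeds are hypothetical choices for illustration. Note that np.percentile interpolates between order statistics by default, so its limits can differ slightly from those obtained by indexing the sorted list.

```python
import numpy as np


def bootstrap_percentile_ci(data, n_resamples=2000, confidence=0.95, seed=None):
    """Percentile bootstrap interval for the sample mean (vectorized sketch)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = data.size
    # One row of indices per bootstrap sample: shape (n_resamples, n).
    idx = rng.integers(0, n, size=(n_resamples, n))
    means = data[idx].mean(axis=1)  # one mean per bootstrap sample
    alpha = 1.0 - confidence
    lower, upper = np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lower), float(upper)


# Illustrative data: 50 integers in [0, 50), generated with a fixed seed.
sample = np.random.default_rng(42).integers(0, 50, size=50)
lower, upper = bootstrap_percentile_ci(sample, seed=0)
print(lower, upper)
```

Passing the same seed returns the same interval, which makes the result easier to check or compare across runs.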
