
Beta Distribution

Paul Johnson and Matt Beverlin

June 10, 2013

How likely is it that

In my view, the

thinks the chances are

People disagree and

0.4 or 0.5 or whatnot.

The Beta density

or probabilities. It is

which work together to

interval and whether it is symmetrical.

The Beta can be used to describe not only the variety observed across people, but it can also describe your subjective degree of belief (in a Bayesian sense). If you are not entirely sure that the probability is 0.22, but rather you think that is the most likely value but that there is some chance that the value is higher or lower, then maybe your personal beliefs can be described as a Beta distribution.

2 Mathematical Definition

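The standard Beta distribution gives the probability density of a value x on the interval (0,1):

$$\mathrm{Beta}(\alpha,\beta):\quad \mathrm{prob}(x\mid\alpha,\beta)=\frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)} \tag{1}$$

where B is the beta function

$$B(\alpha,\beta)=\int_0^1 t^{\alpha-1}(1-t)^{\beta-1}\,dt$$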

2.1 Don’t let all of those betas confuse you.

It is disappointingly confusing, but the word “beta” is used for 3 completely different meanings.

1. Beta(α,β): “Beta” is the name of the probability distribution
2. B(α,β): “Beta” is the name of a function that appears in the denominator of the density function
3. β: “Beta” is the name of the second parameter in the density function
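To keep the three meanings apart, it may help to see them side by side in code. The following is only a minimal Python sketch of equation (1) using the standard library; the names `beta_function` and `beta_density` are made up for this illustration and are not taken from any package.

```python
import math


def beta_function(a, b):
    """B(a, b): the normalizing function in the denominator of the density.

    Computed from log-Gamma values, which avoids overflow for large arguments.
    """
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def beta_density(x, alpha, beta):
    """Density of the Beta(alpha, beta) distribution at a point x in (0, 1).

    Here `beta` is the second shape parameter, not the function B.
    """
    return x ** (alpha - 1) * (1 - x) ** (beta - 1) / beta_function(alpha, beta)


# Density of the Beta(2, 5) distribution evaluated at x = 0.3
print(beta_density(0.3, 2.0, 5.0))
```

In this sketch the distribution is represented by `beta_density`, the function B by `beta_function`, and the parameter β by the argument `beta`.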

2.2 About the Beta function B

The Beta function B in the denominator plays the role of a “normalizing constant” which assures that the total area under the density curve equals 1.

The Beta function is equal to a ratio of Gamma functions:

$$B(\alpha,\beta)=\frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$$

Keeping in mind that for integers, Γ(k)=(k−1)!, one can do some checking and get an idea of what the shape might be.

A 3-dimensional graph of the Beta function can be found in Figure 1.
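One way to do that checking is to compare the Gamma-function ratio with a direct numerical approximation of the integral that defines B. The sketch below is an illustration under my own assumptions: the function names are hypothetical, and the midpoint rule is chosen only for simplicity. It is reasonable when α and β are at least 1, because the integrand is then bounded.

```python
import math


def beta_via_gamma(a, b):
    """B(a, b) as a ratio of Gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def beta_via_integral(a, b, n=100_000):
    """Midpoint-rule approximation of the integral of t^(a-1) (1-t)^(b-1) on (0, 1).

    Intended for a, b >= 1; for smaller values the integrand is unbounded
    near an endpoint and this simple rule converges slowly.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += t ** (a - 1) * (1 - t) ** (b - 1)
    return h * total


# Integer check: B(2, 3) = 1! * 2! / 4! = 1/12
by_factorials = math.factorial(1) * math.factorial(2) / math.factorial(4)
print(beta_via_gamma(2, 3), beta_via_integral(2, 3), by_factorials)
```

All three printed numbers should agree closely, which is the sense in which B acts as the area under the unnormalized curve.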

3 Moments of the Beta

The expected value of a variable that is Beta distributed is:

$$E(x)=\mu=\frac{\alpha}{\alpha+\beta} \tag{2}$$

and the variance is

$$\mathrm{Variance}(x)=\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} \tag{3}$$

People who are familiar with the Generalized Linear Model will notice that

$$V(\mu)=\frac{\beta}{(\alpha+\beta)(\alpha+\beta+1)}\cdot\mu$$

is a variance function, V(µ), which indicates the dependence of the observed variance on the mean. For a fixed pair of parameters (α,β), the variance is proportional to µ. A graph illustrating the Variance function is presented in Figure 2.
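The mean, the variance, and the variance-function form can be checked against each other numerically. This is a small sketch with hypothetical function names. It assumes the multiplier in V(µ) is β/((α+β)(α+β+1)), which is what equation (3) gives when it is divided by µ.

```python
def beta_mean(a, b):
    """Equation (2): expected value of Beta(a, b)."""
    return a / (a + b)


def beta_variance(a, b):
    """Equation (3): variance of Beta(a, b)."""
    return a * b / ((a + b) ** 2 * (a + b + 1))


def variance_function(a, b):
    """Variance written as a multiplier times the mean."""
    mu = beta_mean(a, b)
    multiplier = b / ((a + b) * (a + b + 1))
    return multiplier * mu


a, b = 2.0, 5.0
print(beta_mean(a, b), beta_variance(a, b), variance_function(a, b))
```

The last two printed values should coincide, since the second form is only a rearrangement of the first.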

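The skewness and the excess kurtosis, which are based on the third and fourth moments, are:

$$\mathrm{Skewness}(x)=\frac{2(\beta-\alpha)\sqrt{1+\alpha+\beta}}{\sqrt{\alpha\beta}\,(2+\alpha+\beta)} \tag{4}$$

$$\mathrm{Kurtosis}(x)=\frac{6\left[\alpha^3+\alpha^2(1-2\beta)+\beta^2(1+\beta)-2\alpha\beta(2+\beta)\right]}{\alpha\beta(\alpha+\beta+2)(\alpha+\beta+3)} \tag{5}$$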

Figure 3: Beta(1,1) is the Uniform distribution

3.1 The Mode

If α>1 and β>1, the peak of the density is in the interior of [0,1] and the mode of the Beta distribution is

$$\mathrm{mode}=\gamma=\frac{\alpha-1}{\alpha+\beta-2} \tag{6}$$

If α or β<1, the mode may be at an edge.

As we will illustrate below, if α=β=1, then the Beta is identical to a Uniform distribution.
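The mode formula can be sanity-checked by brute force: evaluate the unnormalized density on a fine grid and see where it peaks. The sketch below uses hypothetical names, and the parameters 3 and 5.67 are picked only as an example. It also confirms that the α=β=1 case is flat.

```python
def beta_mode(a, b):
    """Equation (6); only meaningful when both parameters exceed 1."""
    if a <= 1 or b <= 1:
        raise ValueError("interior mode requires alpha > 1 and beta > 1")
    return (a - 1) / (a + b - 2)


def unnormalized_density(x, a, b):
    """Numerator of the Beta density; B(a, b) does not move the peak."""
    return x ** (a - 1) * (1 - x) ** (b - 1)


# Grid of interior points 0.001, 0.002, ..., 0.999
grid = [i / 1000 for i in range(1, 1000)]

# Location of the highest grid value for Beta(3, 5.67)
peak = max(grid, key=lambda x: unnormalized_density(x, 3.0, 5.67))
print(peak, beta_mode(3.0, 5.67))

# With alpha = beta = 1 every grid point has the same height (Uniform)
flat = all(unnormalized_density(x, 1.0, 1.0) == 1.0 for x in grid)
print(flat)
```

The grid peak can only match the formula to within the grid spacing, so agreement to about three decimals is all this check can show.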

4 Illustration

One advantage of the Beta distribution is that it can take on many different shapes. If one believed that all scores were equally likely, then one could set the parameters α=1 and β=1. As illustrated in Figure 3, this gives a “flat” probability density function.

In models of elections, one may need a distribution of ideal points to resemble a single-peaked distribution on the interval [0,1]. The Beta can be very useful in this kind of exercise. Consider Figure 4.

At one point, it fascinated me that the mode did not equal the mean and that the variance ends up characterizing the “slack” between those two things. Various densities in Figure 5 might be entertaining. In these examples, the Beta parameters are chosen to keep the mode constant at 0.3. Note how the mean and variance change across the illustrations.

[Figure 4: three panels, Beta(3, 5.67), Beta(3, 3), and Beta(5.67, 3); vertical axis: probability density]

Figure 5: Beta Distributions with Mode=0.3

Panels (horizontal axis: ideal point; vertical axis: density):
Beta(1.1, 1.23) mode=0.3 mean=0.47, var=0.075
Beta(1.76, 2.76) mode=0.3 mean=0.39, var=0.043
Beta(2.41, 4.29) mode=0.3 mean=0.36, var=0.03
Beta(3.07, 5.82) mode=0.3 mean=0.34, var=0.023
Beta(3.72, 7.35) mode=0.3 mean=0.34, var=0.018
Beta(4.38, 8.88) mode=0.3 mean=0.33, var=0.016
Beta(5.03, 10.41) mode=0.3 mean=0.33, var=0.013
Beta(5.69, 11.94) mode=0.3 mean=0.32, var=0.012
Beta(6.34, 13.47) mode=0.3 mean=0.32, var=0.01
Beta(7, 15) mode=0.3 mean=0.32, var=0.009

5 About the connection between the mean, the mode, and the variance

In the pictures displaying the Beta density, one’s eye is drawn to the peak of the frequency distribution, which is the mode. We can set the Beta’s parameters in order to generate a distribution with a desired mode. Let the mode be represented by γ.

Here’s a simple starting point: Suppose the mode is .50. That is the same as the mean (it’s symmetric), and the mode formula (6) implies:

$$.50=\frac{\alpha-1}{\alpha+\beta-2} \tag{7}$$

and

$$.50\alpha+.50\beta-1=\alpha-1$$

$$.5\alpha=.5\beta$$

$$\alpha=\beta \tag{8}$$

If one wants the mode to be in the middle, one can choose any value for α (greater than 1, so that formula (6) applies), as long as one chooses the same value for β. (Whew! What a relief. This exactly matched my intuition.)

If the mode is in the center, we know α and β are equal, but we don’t know their values. The selection, it turns out, depends on how much diversity there is. If one wants a distribution to have points “tightly bunched” around the mode, then one should choose a large value for α, say 10.0,

$$\text{variance of Beta}(10,10)=0.01190 \tag{9}$$

In contrast, if α=1.5, the variance is much greater:

$$\text{variance of Beta}(1.5,1.5)=0.0625 \tag{10}$$

Seen in this light, the parameter α is a “homogeneity indicator.” As α gets bigger, the distribution collapses around the mode.
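The two variances quoted above follow directly from equation (3). When α=β, that formula simplifies to 1/(4(2α+1)), which makes the shrinkage easy to see. Here is a short sketch; the function name and the extra value α=100 are my own choices for illustration.

```python
def beta_variance(a, b):
    """Equation (3): variance of Beta(a, b)."""
    return a * b / ((a + b) ** 2 * (a + b + 1))


# Symmetric case: alpha = beta, so the mode and the mean are both 0.5
for alpha in (1.5, 10.0, 100.0):
    general = beta_variance(alpha, alpha)
    simplified = 1.0 / (4.0 * (2.0 * alpha + 1.0))  # same quantity, simplified
    print(alpha, general, simplified)
```

For α=10 this gives 1/84 and for α=1.5 it gives 1/16, which match the values in equations (9) and (10).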

Although this particular calculation works only for a mode in the center, it does outline the process that we can use to assign α and β for all other values of the mode.

Figure 6: Some Unpleasant Betas

Panels (horizontal axis: x; vertical axis: density):
Beta(0.7, 0.2) mean=0.78 var=0.09
Beta(0.7, 0.5) mean=0.58 var=0.11
Beta(0.7, 0.75) mean=0.48 var=0.1
Beta(0.7, 1.1) mean=0.39 var=0.08
Beta(1.2, 0.2) mean=0.86 var=0.05
Beta(1.2, 0.5) mean=0.71 var=0.08
Beta(1.2, 0.75) mean=0.62 var=0.08
Beta(1.2, 1.1) mean=0.52 var=0.08

Suppose the mode is .4. From equation 6

$$.40=\frac{\alpha-1}{\alpha+\beta-2}$$

$$\beta=\frac{3}{2}\alpha-\frac{1}{2} \tag{11}$$

$$.60\alpha=.2+.40\beta$$

$$\alpha=\frac{1}{3}+\frac{2}{3}\beta \tag{12}$$

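It is quite possible to calculate one parameter as a function of another, after specifying the mode, even if the mode is off center.

Generally speaking, for any value of the mode, γ∈(0,1) (keeping in mind the original stipulation that α,β>1):

$$\gamma=\frac{\alpha-1}{\alpha+\beta-2} \tag{13}$$

$$\gamma\alpha+\gamma\beta-2\gamma=\alpha-1 \tag{14}$$

$$(1-\gamma)\alpha=\gamma\beta-2\gamma+1 \tag{15}$$

$$\alpha=\frac{\gamma\beta-2\gamma+1}{(1-\gamma)}=\frac{\gamma(\beta-2)+1}{(1-\gamma)}=\frac{\gamma}{1-\gamma}\beta-\frac{2\gamma-1}{1-\gamma} \tag{16}$$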

So α is a linear function of β. (Note: 2013-10-25; reader notified me of typographical error in equation 16. Sorry!)

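And

$$\gamma\beta=\alpha-1-\gamma\alpha+2\gamma \tag{17}$$

$$\beta=\frac{\alpha-\gamma\alpha+2\gamma-1}{\gamma}=\frac{\alpha-\gamma(\alpha-2)-1}{\gamma}=\frac{(1-\gamma)}{\gamma}\alpha-\frac{1-2\gamma}{\gamma} \tag{18}$$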

This indicates that if we begin with the mode, and then take as given either α or β, we can calculate the missing parameter (β or α, as the case may be). As a result, instead of thinking of the Beta’s shape as determined by parameters α and β, sometimes it is easier to think of it in terms of the mode (most likely value) and the homogeneity.
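As a rough illustration of this "mode plus homogeneity" way of thinking, the linear relations (16) and (18) can be written as two small Python functions. The function names are hypothetical, and the sketch assumes a mode strictly between 0 and 1 with both parameters greater than 1.

```python
def beta_from_mode(mode, alpha):
    """Equation (18): the beta that gives the requested mode for a given alpha."""
    return (1 - mode) / mode * alpha - (1 - 2 * mode) / mode


def alpha_from_mode(mode, beta):
    """Equation (16): the alpha that gives the requested mode for a given beta."""
    return mode / (1 - mode) * beta - (2 * mode - 1) / (1 - mode)


def mean_and_variance(alpha, beta):
    """Equations (2) and (3)."""
    total = alpha + beta
    return alpha / total, alpha * beta / (total ** 2 * (total + 1))


# Hold the mode at 0.3 and let alpha play the role of the homogeneity indicator
for alpha in (1.1, 7.0):
    beta = beta_from_mode(0.3, alpha)
    mean, var = mean_and_variance(alpha, beta)
    print(f"Beta({alpha:.2f}, {beta:.2f}) mean={mean:.2f} var={var:.3f}")
```

With the mode fixed at 0.3, α=1.1 and α=7 lead to β of roughly 1.23 and 15, the first and last parameter pairs listed for Figure 5.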
