[Pattern Recognition] A Detailed Derivation of the Fisher Linear Discriminant
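Idea
Fisher's idea is to choose a projection direction that maps the high-dimensional data to a lower-dimensional space (here, a line) so that, after projection, the two classes are as far apart as possible relative to the spread within each class.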
Decision direction
1. Suppose we have two classes of samples, $W_1=\{x_1^1,x_1^2,\ldots,x_1^n\}$ with $n$ samples and $W_2=\{x_2^1,x_2^2,\ldots,x_2^m\}$ with $m$ samples, where $x_k^i$ is the $i$-th sample (a column vector) of class $k$.
The mean of the first class, taken over its $n$ samples, is
$$m_1=\frac{1}{n}\sum_{i=1}^{n}x_1^i$$
The mean of the second class, taken over its $m$ samples, is
$$m_2=\frac{1}{m}\sum_{i=1}^{m}x_2^i$$
The mean of all samples (written $\bar{m}$ to avoid a clash with the sample count $m$) is
$$\bar{m}=\frac{1}{m+n}\left(\sum_{i=1}^{n}x_1^i+\sum_{j=1}^{m}x_2^j\right)$$
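The three means above can be computed directly with NumPy. The following is a minimal sketch on hypothetical toy data (the arrays `X1` and `X2` are made up for illustration and do not come from the article), with samples stored as rows.

```python
import numpy as np

# Hypothetical toy data (not from the article): rows are samples, columns are features.
X1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])  # class 1, n = 3 samples
X2 = np.array([[6.0, 5.0], [7.0, 8.0]])              # class 2, m = 2 samples

m1 = X1.mean(axis=0)                      # class-1 mean
m2 = X2.mean(axis=0)                      # class-2 mean
m_all = np.vstack([X1, X2]).mean(axis=0)  # mean of all n + m samples

n, m = len(X1), len(X2)
# The overall mean is the sample-count-weighted average of the two class means.
m_all_weighted = (n * m1 + m * m2) / (n + m)

print(m1, m2, m_all)
```

Note that the overall mean is not the plain average of `m1` and `m2` unless the two classes have the same number of samples.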
The within-class scatter matrix of the first class is the sum over its samples
$$S_w^1=\sum_{i=1}^{n}(x_1^i-m_1)(x_1^i-m_1)^T$$
The within-class scatter matrix of the second class is the sum over its samples
$$S_w^2=\sum_{i=1}^{m}(x_2^i-m_2)(x_2^i-m_2)^T$$
The total within-class scatter matrix is
$$S_w=S_w^1+S_w^2=\sum_{i=1}^{2}\sum_{j=1}^{n_i}(x_i^j-m_i)(x_i^j-m_i)^T$$
where $n_i$ is the number of samples in class $i$ ($n_1=n$, $n_2=m$).
The between-class scatter matrix is
$$S_b=(m_1-m_2)(m_1-m_2)^T$$
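As a rough illustration of the scatter matrices just defined, the sketch below builds $S_w$ and $S_b$ with NumPy. The helper name `scatter_matrix` and the toy arrays are assumptions made for this example, not part of the original article.

```python
import numpy as np

def scatter_matrix(X):
    """Sum over samples of (x - mean)(x - mean)^T, with samples as rows of X."""
    D = X - X.mean(axis=0)
    return D.T @ D

# Hypothetical toy data: rows are samples, columns are features.
X1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])  # class 1
X2 = np.array([[6.0, 5.0], [7.0, 8.0]])              # class 2

S_w1 = scatter_matrix(X1)
S_w2 = scatter_matrix(X2)
S_w = S_w1 + S_w2                      # total within-class scatter

d = X1.mean(axis=0) - X2.mean(axis=0)  # m1 - m2
S_b = np.outer(d, d)                   # between-class scatter, rank 1

print(S_w)
print(S_b)
```

Each per-class scatter matrix is the sample covariance without its normalization: `scatter_matrix(X)` equals `(len(X) - 1) * np.cov(X, rowvar=False)`.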
Projecting onto a one-dimensional space:
$$y=w^Tx$$
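With projection vector $w$, the between-class scatter of the projected samples becomes $w^TS_bw$ and the within-class scatter becomes $w^TS_ww$. The Fisher criterion is their ratio:
$$J=\frac{w^TS_bw}{w^TS_ww}$$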
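To see why the numerator and denominator of $J$ take these quadratic forms, one can project the samples first and measure the scatter on the resulting scalars. The sketch below does this on hypothetical toy data with an arbitrary direction `w` chosen only for illustration.

```python
import numpy as np

# Hypothetical toy data: rows are samples, columns are features.
X1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])  # class 1
X2 = np.array([[6.0, 5.0], [7.0, 8.0]])              # class 2
w = np.array([1.0, -0.5])  # arbitrary direction, for illustration only

y1, y2 = X1 @ w, X2 @ w    # projected samples y = w^T x

# Scatter measured directly on the one-dimensional projections.
s_w = ((y1 - y1.mean()) ** 2).sum() + ((y2 - y2.mean()) ** 2).sum()
s_b = (y1.mean() - y2.mean()) ** 2

# The same quantities expressed through the matrices S_w and S_b.
D1, D2 = X1 - X1.mean(axis=0), X2 - X2.mean(axis=0)
S_w = D1.T @ D1 + D2.T @ D2
d = X1.mean(axis=0) - X2.mean(axis=0)
S_b = np.outer(d, d)

J = (w @ S_b @ w) / (w @ S_w @ w)
print(J, s_b / s_w)  # the two values agree
```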
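This $J$ is the criterion function: the larger $J$ is, the better the two classes are separated. The goal is to find the $w$ that maximizes $J$. Both the numerator and the denominator vary with $w$, but rescaling $w$ does not change the value of $J$, so we can fix the denominator by setting $w^TS_ww=c$, where $c$ is a nonzero constant. Maximizing $J$ then takes the following form:
$$\max_{w}\; w^TS_bw$$
$$\text{s.t.}\quad w^TS_ww=c$$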
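The scale invariance that justifies this constraint is easy to check numerically. In the sketch below, the function name `fisher_criterion`, the two matrices, and the choice $c=1$ are all assumptions made for illustration.

```python
import numpy as np

def fisher_criterion(w, S_b, S_w):
    """J(w) = (w^T S_b w) / (w^T S_w w)."""
    return (w @ S_b @ w) / (w @ S_w @ w)

# Hypothetical scatter matrices for a 2-D toy problem.
S_w = np.array([[2.5, 3.5], [3.5, 6.5]])
S_b = np.array([[20.25, 15.75], [15.75, 12.25]])

w = np.array([1.0, 0.0])

# Rescale w so that it satisfies the constraint w^T S_w w = c.
c = 1.0
w_c = w * np.sqrt(c / (w @ S_w @ w))

print(fisher_criterion(w, S_b, S_w))
print(fisher_criterion(w_c, S_b, S_w))         # same value after rescaling
print(fisher_criterion(-7.0 * w, S_b, S_w))    # same value for any nonzero scale
```

Because any nonzero rescaling leaves $J$ unchanged, restricting the search to vectors with $w^TS_ww=c$ loses no candidate directions.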
Using the method of Lagrange multipliers,
$$L(w,\lambda)=w^TS_bw-\lambda(w^TS_ww-c)$$
Setting the derivative with respect to $w$ to zero (written as a row vector, using the symmetry of $S_b$ and $S_w$):
$$\frac{\partial L(w,\lambda)}{\partial w}=2w^TS_b-2\lambda w^TS_w=0$$
that is,
$$w^TS_b=\lambda w^TS_w$$
If $S_w$ is nonsingular, it is invertible. Transposing both sides (again using the symmetry of $S_b$ and $S_w$) and left-multiplying by $S_w^{-1}$ gives
$$S_w^{-1}S_bw=\lambda w$$
This shows that $w$ is an eigenvector of the matrix $S_w^{-1}S_b$ belonging to the eigenvalue $\lambda$. Since
$$S_b=(m_1-m_2)(m_1-m_2)^T$$
we have
$$S_w^{-1}(m_1-m_2)(m_1-m_2)^Tw=\lambda w$$
Here $(m_1-m_2)^Tw$ is a scalar, so it affects only the length of $w$, not its direction. We can therefore take
$$w=S_w^{-1}(m_1-m_2)$$
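The closed-form direction can be checked against the eigenvalue equation numerically. The sketch below uses hypothetical values for the class means and $S_w$. It solves the linear system rather than forming $S_w^{-1}$ explicitly, which is a common numerical choice and not something the derivation requires.

```python
import numpy as np

# Hypothetical class means and within-class scatter for a 2-D toy problem.
m1 = np.array([2.0, 3.0])
m2 = np.array([6.5, 6.5])
S_w = np.array([[2.5, 3.5], [3.5, 6.5]])

d = m1 - m2
S_b = np.outer(d, d)

# w = S_w^{-1} (m1 - m2), obtained by solving S_w w = (m1 - m2).
w = np.linalg.solve(S_w, d)

# Eigenvalue check: S_w^{-1} S_b w should equal lam * w with lam = (m1 - m2)^T w.
lam = d @ w
lhs = np.linalg.solve(S_w, S_b @ w)

def J(v):
    """Fisher criterion for direction v."""
    return (v @ S_b @ v) / (v @ S_w @ v)

# Compare against random directions: none should beat the closed-form w.
rng = np.random.default_rng(0)
best_other = max(J(v) for v in rng.normal(size=(1000, 2)))

print(w, lam, J(w), best_other)
```

For this choice of $w$ the eigenvalue $\lambda$ coincides with the maximal value of $J$, since $S_ww=m_1-m_2$ makes both equal to $(m_1-m_2)^Tw$.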
Decision surface
The decision function is linear in $x$:
$$g(x)=w^Tx+w_0$$
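If prior probabilities are ignored, we can take
$$w_0=-\frac{1}{2}w^T(m_1+m_2)=-m_0$$
where $m_0=\frac{1}{2}(w^Tm_1+w^Tm_2)$ is the midpoint of the two projected class means. It equals the mean of all projected samples when the two classes have the same number of samples.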
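A minimal sketch of this prior-free threshold, again on hypothetical means and scatter: the decision function is positive on the class-1 side and negative on the class-2 side, and it vanishes at the midpoint between the class means.

```python
import numpy as np

# Hypothetical class means and within-class scatter for a 2-D toy problem.
m1 = np.array([2.0, 3.0])
m2 = np.array([6.5, 6.5])
S_w = np.array([[2.5, 3.5], [3.5, 6.5]])

w = np.linalg.solve(S_w, m1 - m2)  # w = S_w^{-1} (m1 - m2)
w0 = -0.5 * w @ (m1 + m2)          # threshold that ignores the priors

def g(x):
    """Linear decision function g(x) = w^T x + w0."""
    return w @ x + w0

print(g(m1), g(m2), g(0.5 * (m1 + m2)))
```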
If prior probabilities are taken into account, Bayesian information is added. For two Gaussian classes $\omega_1$ and $\omega_2$ with a shared covariance matrix $\Sigma$ and priors $P(\omega_1)$ and $P(\omega_2)$, the Bayes threshold is
$$w_0=-\frac{1}{2}(m_1+m_2)^T\Sigma^{-1}(m_1-m_2)-\ln\frac{P(\omega_2)}{P(\omega_1)}$$
If the within-class scatter is treated as the covariance, replacing $\Sigma^{-1}$ with $S_w^{-1}$ gives
$$w_0=-\frac{1}{2}(m_1+m_2)^TS_w^{-1}(m_1-m_2)-\ln\frac{P(\omega_2)}{P(\omega_1)}$$
Strictly, the covariance estimate is $S_w/(n+m-2)$, so using $S_w$ directly rescales $w$ and changes the relative weight of the prior term.
The decision surface becomes
$$g(x)=w^Tx+w_0=w^Tx-\frac{1}{2}(m_1+m_2)^TS_w^{-1}(m_1-m_2)-\ln\frac{P(\omega_2)}{P(\omega_1)}$$
where
$$S_w^{-1}(m_1-m_2)=w$$
Substituting,
$$g(x)=w^Tx+w_0=w^Tx-\frac{1}{2}(m_1+m_2)^Tw-\ln\frac{P(\omega_2)}{P(\omega_1)}$$
$$g(x)=w^T\left(x-\frac{1}{2}(m_1+m_2)\right)-\ln\frac{P(\omega_2)}{P(\omega_1)}$$
Setting
$$g(x)=w^T\left(x-\frac{1}{2}(m_1+m_2)\right)-\ln\frac{P(\omega_2)}{P(\omega_1)}=0$$
the quantity to examine in the end is
$$w^T\left(x-\frac{1}{2}(m_1+m_2)\right)=\ln\frac{P(\omega_2)}{P(\omega_1)}$$
If
$$w^T\left(x-\frac{1}{2}(m_1+m_2)\right)>\ln\frac{P(\omega_2)}{P(\omega_1)}$$
then $x\in\omega_1$.
If
$$w^T\left(x-\frac{1}{2}(m_1+m_2)\right)<\ln\frac{P(\omega_2)}{P(\omega_1)}$$
then $x\in\omega_2$.
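The decision rule with priors can be sketched as a small function. Everything named here (`classify`, the default priors, the toy means and scatter) is hypothetical, and the threshold is the log of the prior ratio, as in the Gaussian Bayes rule with shared covariance.

```python
import numpy as np

def classify(x, w, m1, m2, p1=0.5, p2=0.5):
    """Return 1 if x is assigned to class omega_1, otherwise 2.

    Compares w^T (x - (m1 + m2) / 2) with the threshold ln(p2 / p1).
    Ties are assigned to class 2 in this sketch.
    """
    score = w @ (x - 0.5 * (m1 + m2))
    threshold = np.log(p2 / p1)
    return 1 if score > threshold else 2

# Hypothetical class means and within-class scatter for a 2-D toy problem.
m1 = np.array([2.0, 3.0])
m2 = np.array([6.5, 6.5])
S_w = np.array([[2.5, 3.5], [3.5, 6.5]])
w = np.linalg.solve(S_w, m1 - m2)

# A point slightly on the class-2 side of the midpoint (its score is about -1).
x = 0.5 * (m1 + m2) + (m2 - m1) / 13.0

print(classify(x, w, m1, m2))                  # equal priors
print(classify(x, w, m1, m2, p1=0.9, p2=0.1))  # strong prior for class 1
```

With equal priors the threshold is zero and the rule reduces to the midpoint test. A larger prior for $\omega_1$ lowers the threshold, so borderline points such as `x` move to class 1.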