CS 229 notes Supervised Learning
标签(空格分隔): 监督学习 线性代数
Forword
the proof of Normal equation and, before that, some linear algebra equations, which will be used in the proof.
The normal equation
Linear algebra preparation
For two matrices and such that is square, .
Proof:
Some properties:
some facts of matrix derivative:
Proof:
Proof 1:
Proof 2:
Proof:
( refers to the cofactor)
Least squares revisited
(if we don’t include the intercept term)
since ,
Thus,
$\frac{1}{2}(X\theta-\vec{y})^T(X\theta-\vec{y}) = \frac{1}{2}\displaystyle{\sum{i=1}^{m}(h\theta(x^{(i)}) -y^{(i)})^2} = J(\theta) $.Combine Equations :
Hence
Notice it is a real number, or you can see it as a matrix, so
since and involves no elements. then use equation with ,
To minmize , we set its derivative to zero, and obtain the normal equation: