8 Multiple Features
$x_j$ = $j^{th}$ feature
$n$ = number of features
$\vec{x}^{(i)}$ = features of the $i^{th}$ training example
$x^{(i)}_j$ = value of feature $j$ in the $i^{th}$ training example
Model:
Previously:
$f_{w,b}(x)=wx+b$
Now:
$f_{w,b}(x)=w_1x_1+w_2x_2+\cdots+w_nx_n+b$
$\vec{w}=[w_1\ w_2\ w_3\ \dots\ w_n]$
$b$ is a number
$\vec{w}$ and $b$ are parameters of the model.
$\vec{x}=[x_1\ x_2\ x_3\ \dots\ x_n]$
$f_{\vec{w},b}(\vec{x})=\vec{w}\cdot\vec{x}+b$
This model is called multiple linear regression.
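To make the notation concrete, here is a minimal sketch of how it might map onto a NumPy array. The feature matrix `X`, the parameter values, and the helper `predict` are all made up for illustration; they are not from the course.

```python
import numpy as np

# Hypothetical training set: 3 examples (rows), n = 4 features (columns).
X = np.array([
    [2104.0, 5.0, 1.0, 45.0],
    [1416.0, 3.0, 2.0, 40.0],
    [852.0, 2.0, 1.0, 35.0],
])
n = X.shape[1]  # n = number of features

# Math counts from 1 and code counts from 0, so x^(2) is row index 1
# and x_3^(2) is row index 1, column index 2.
x_2 = X[1]
x_2_3 = X[1, 2]

# Hypothetical parameters: one weight per feature, plus a scalar bias.
w = np.array([0.1, 4.0, 10.0, -2.0])
b = 80.0


def predict(x, w, b):
    """Evaluate f_{w,b}(x) = w . x + b for a single example x."""
    return np.dot(w, x) + b


f_2 = predict(x_2, w, b)
print(f"n = {n}, x_3^(2) = {x_2_3}, f(x^(2)) = {f_2}")
```

Storing one example per row is only one possible layout; it is chosen here because `X[i]` then returns the full feature vector of an example.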
9 Vectorization
Example:
$\vec{w}=[w_1\ w_2\ w_3]$
$b$ is a number
$\vec{x}=[x_1\ x_2\ x_3]$
(Linear algebra notation counts indices from 1.)
import numpy as np

w = np.array([1.0, 2.5, -3.3])
b = 4
x = np.array([10, 20, 30])
(Code counts indices from 0, so $w_1$ is w[0].)
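The off-by-one between the two conventions is easy to get wrong, so a small sketch may help. The helper `math_index` is hypothetical and exists only to show the mapping from the 1-based subscript $j$ to the 0-based position `j - 1`.

```python
import numpy as np

w = np.array([1.0, 2.5, -3.3])
x = np.array([10, 20, 30])


def math_index(v, j):
    """Return v_j in 1-based math notation (hypothetical helper)."""
    return v[j - 1]


w_1 = math_index(w, 1)  # same element as w[0]
x_3 = math_index(x, 3)  # same element as x[2]
print(w_1, x_3)
```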
Without vectorization
$f_{\vec{w},b}(\vec{x})=w_1x_1+w_2x_2+w_3x_3+b$
f = w[0] * x[0] + w[1] * x[1] + w[2] * x[2] + b
$f_{\vec{w},b}(\vec{x})=\sum_{j=1}^{n}w_jx_j+b$
n = w.shape[0]  # number of features
f = 0
for j in range(0, n):
    f = f + w[j] * x[j]
f = f + b
Vectorization
$f_{\vec{w},b}(\vec{x})=\vec{w}\cdot\vec{x}+b$
f = np.dot(w, x) + b
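Vectorization has two distinct benefits:
1. It makes the code shorter.
2. It makes the code run much faster, because np.dot runs in optimized compiled code that can exploit parallel hardware, whereas a Python loop computes the terms one at a time.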
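The speed difference can be checked with a rough timing sketch like the one below. The vector length, the random data, and the function names `predict_loop` and `predict_vectorized` are assumptions made for illustration, and the measured times will vary from machine to machine.

```python
import time

import numpy as np

# Hypothetical large problem: n random weights and features.
rng = np.random.default_rng(0)
n = 100_000
w = rng.random(n)
x = rng.random(n)
b = 4.0


def predict_loop(w, x, b):
    """Sum w[j] * x[j] one term at a time, then add b."""
    f = 0.0
    for j in range(w.shape[0]):
        f = f + w[j] * x[j]
    return f + b


def predict_vectorized(w, x, b):
    """Compute the same value with a single dot product."""
    return np.dot(w, x) + b


start = time.perf_counter()
f_loop = predict_loop(w, x, b)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
f_vec = predict_vectorized(w, x, b)
vec_seconds = time.perf_counter() - start

print(f"loop:       {f_loop:.4f} in {loop_seconds:.6f} s")
print(f"vectorized: {f_vec:.4f} in {vec_seconds:.6f} s")
```

Both versions should agree up to floating-point rounding; only the running time is expected to differ.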
Gradient descent
$\vec{w}=(w_1\ w_2\ \dots\ w_{16})$
derivatives:
$\vec{d}=(d_1\ d_2\ \dots\ d_{16})$
(without $b$)
w = np.array([0.5, 1.3, ..., 3.4])
d = np.array([0.3, 0.2, ..., 0.4])
Compute $w_j = w_j - 0.1d_j$ for $j=1\dots16$
Without vectorization
$w_1=w_1-0.1d_1$
$w_2=w_2-0.1d_2$
$\vdots$
$w_{16}=w_{16}-0.1d_{16}$
for j in range(0, 16):
    w[j] = w[j] - 0.1 * d[j]
With vectorization
$\vec{w}=\vec{w}-0.1\vec{d}$
w = w - 0.1 * d
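As a quick check that the two update styles agree, the sketch below runs both on the same data. The 16 parameter and derivative values are placeholders generated with `np.linspace` (they are not the values elided in the snippet above), and the name `alpha` for the 0.1 step size is an assumption.

```python
import numpy as np

# Placeholder data: 16 hypothetical parameters and 16 hypothetical derivatives.
w = np.linspace(0.5, 3.5, 16)
d = np.linspace(0.3, 0.4, 16)
alpha = 0.1  # step size, the 0.1 in w_j = w_j - 0.1 * d_j

# Without vectorization: update one parameter at a time.
w_loop = w.copy()
for j in range(w_loop.shape[0]):
    w_loop[j] = w_loop[j] - alpha * d[j]

# With vectorization: update all 16 parameters in one array expression.
w_vec = w - alpha * d

print(np.allclose(w_loop, w_vec))
```

Copying `w` before the loop keeps the original array intact, so both versions start from the same values.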
Source: Andrew Ng's Machine Learning course