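# Building the neural network

The network has three inputs $x_1, x_2, x_3$ (nodes 1 to 3), a hidden layer with nodes 4 and 5, and an output layer with nodes 6 and 7.

(Note: in a weight's subscript, the index of the node in the next layer comes before the index of the node in the previous layer, e.g. $w_{41}$ rather than $w_{14}$.)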

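# Deriving forward propagation

The output of hidden node 4 is

$y_4=f(\vec{w_4}^T\cdot\vec{x})$

$\vec{w_4} = [w_{b_4}, w_{41}, w_{42}, w_{43}]$

$\vec{x} = \begin{bmatrix} 1 \\ x_1\\ x_2\\ x_3\\ \end{bmatrix}$

where $f$ is the activation function, $w_{b_4}$ is the bias of node 4, and the leading $1$ in $\vec{x}$ is the constant input that multiplies the bias.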
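The single-node formula can be sketched in a few lines of plain Python. The text does not fix a particular activation function, so the logistic sigmoid used here is an assumption, and the weight and input values are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function, an assumed choice for the activation f."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(weights, inputs, f=sigmoid):
    """Compute y = f(w^T x), where x is the input with a leading 1 for the bias."""
    x = [1.0] + list(inputs)
    net = sum(w * xi for w, xi in zip(weights, x))
    return f(net)

# Hypothetical values for w_4 = [w_b4, w_41, w_42, w_43] and the inputs x_1..x_3.
w4 = [0.1, 0.2, -0.3, 0.4]
y4 = neuron_output(w4, [1.0, 2.0, 3.0])
print(y4)
```

Storing the bias as the first weight keeps the code identical to the formula: the bias is just one more term of the dot product.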

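Stacking the weight vectors of all hidden nodes as rows gives the hidden layer in matrix form, with $f$ applied element-wise:

$\overrightarrow{y_{\text{hidden}}}=f(W_{\text{hidden}}\cdot\vec{x})$

$W_{\text{hidden}} = \begin{bmatrix} \vec{w_4} \\ \vec{w_5} \\ \end{bmatrix} =\begin{bmatrix} w_{b_4} & w_{41} & w_{42} & w_{43} \\ w_{b_5} & w_{51} & w_{52} & w_{53} \\ \end{bmatrix}$

$\vec{x} = \begin{bmatrix} 1 \\ x_1\\ x_2\\ x_3\\ \end{bmatrix}$

$\overrightarrow{y_{\text{hidden}}}=\begin{bmatrix} y_4 \\ y_5 \\ \end{bmatrix}$

The output layer is computed the same way, with the hidden-layer output taking the place of the input. Because $W_{\text{output}}$ also carries a bias column, the hidden-layer output is extended with a leading $1$:

$\overrightarrow{y_{\text{output}}}=f\left(W_{\text{output}}\cdot\begin{bmatrix} 1 \\ y_4 \\ y_5 \\ \end{bmatrix}\right)$

$W_{\text{output}} = \begin{bmatrix} \vec{w_6} \\ \vec{w_7} \\ \end{bmatrix} =\begin{bmatrix} w_{b_6} & w_{64} & w_{65} \\ w_{b_7} & w_{74} & w_{75} \\ \end{bmatrix}$

$\overrightarrow{y_{\text{output}}}=\begin{bmatrix} y_6 \\ y_7 \\ \end{bmatrix}$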
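As a rough illustration of the two matrix formulas, the sketch below runs a full forward pass through the 3-2-2 network. It assumes a sigmoid activation (the text leaves $f$ unspecified), and every weight value is made up for the example. Each weight row starts with its bias, so each layer's input is prefixed with a constant 1.

```python
import math

def sigmoid(z):
    """Logistic function, an assumed choice for the activation f."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(W, inputs, f=sigmoid):
    """Return f(W . [1, inputs...]) element-wise; each row of W starts with the bias."""
    x = [1.0] + list(inputs)
    return [f(sum(w * xi for w, xi in zip(row, x))) for row in W]

# Hypothetical weights, one row per node.
W_hidden = [
    [0.1, 0.2, -0.3, 0.4],    # w_b4, w_41, w_42, w_43
    [-0.2, 0.5, 0.1, -0.1],   # w_b5, w_51, w_52, w_53
]
W_output = [
    [0.3, -0.4, 0.6],         # w_b6, w_64, w_65
    [-0.1, 0.7, 0.2],         # w_b7, w_74, w_75
]

x = [1.0, 2.0, 3.0]                           # x_1, x_2, x_3
y_hidden = layer_forward(W_hidden, x)         # [y_4, y_5]
y_output = layer_forward(W_output, y_hidden)  # [y_6, y_7]
print(y_hidden, y_output)
```

Reusing one `layer_forward` function for both layers mirrors the derivation: the output layer applies the same rule, only with a different weight matrix and input vector.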

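# Deriving backpropagation

In what follows, $i$ indexes the input nodes, $j$ the hidden nodes and $k$ the output nodes; $E$ is the error of the network and $\eta$ is the learning rate. Gradient descent updates each output-layer weight by

$w_{kj}=w_{kj}-\eta\frac{dE}{dw_{kj}}$

$net_k$ is the weighted input of node $k$:

$net_k=\overrightarrow{w_k}^T\cdot\overrightarrow{y_{\text{hidden}}} = \sum_{j} w_{kj}y_j$

Since $E$ depends on $w_{kj}$ only through $net_k$, the chain rule gives

$\begin{aligned} \frac{dE}{dw_{kj}} & = \frac{dE}{dnet_k}\frac{dnet_k}{dw_{kj}} \\ & = \frac{dE}{dnet_k}\frac{d\sum_{j'} w_{kj'}y_{j'}}{dw_{kj}} \\ & = \frac{dE}{dnet_k}y_{j} \end{aligned}$

Define the error term of node $k$ as

$\delta_k = \frac{dE}{dnet_k}$

The same argument applies to a hidden-layer weight $w_{ji}$, whose node $j$ receives the input $x_i$:

$\begin{aligned} w_{ji} & = w_{ji}-\eta\frac{dE}{dw_{ji}} \\ \frac{dE}{dw_{ji}} & = \frac{dE}{dnet_j}x_{i} \\ \delta_j & = \frac{dE}{dnet_j} \end{aligned}$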

For an output node $k$, $E$ depends on $net_k$ only through the node's output $y_k = f(net_k)$, so

$\delta_k=\frac{dE}{dnet_k} = \frac{dE}{dy_k}\frac{dy_k}{dnet_k}$

$\frac{dE}{dy_k} = \frac{dE(\overrightarrow{y_{\text{output}}})}{dy_k}$

$\frac{dy_k}{dnet_k} = \frac{df(net_k)}{dnet_k}$

$\delta_k = \frac{dE(\overrightarrow{y_{\text{output}}})}{dy_k}\frac{df(net_k)}{dnet_k}$
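To make the output-layer error term concrete, the sketch below picks a specific loss and activation, neither of which is fixed by the text: a squared error $E = \frac{1}{2}\sum_k (y_k - t_k)^2$ with hypothetical targets $t_k$, and a sigmoid $f$. Under those assumptions $\frac{dE}{dy_k} = y_k - t_k$ and $\frac{df(net_k)}{dnet_k} = y_k(1 - y_k)$. The result is compared with a central finite difference of $E$ with respect to $net_k$ as a sanity check.

```python
import math

def sigmoid(z):
    """Logistic function, an assumed choice for the activation f."""
    return 1.0 / (1.0 + math.exp(-z))

def squared_error(y, t):
    """Assumed loss: E = 1/2 * sum_k (y_k - t_k)^2."""
    return 0.5 * sum((yk - tk) ** 2 for yk, tk in zip(y, t))

def output_deltas(net, t):
    """delta_k = dE/dy_k * df(net_k)/dnet_k for the assumed loss and activation."""
    deltas = []
    for net_k, t_k in zip(net, t):
        y_k = sigmoid(net_k)
        dE_dy = y_k - t_k              # derivative of the squared error
        dy_dnet = y_k * (1.0 - y_k)    # derivative of the sigmoid
        deltas.append(dE_dy * dy_dnet)
    return deltas

net = [0.5, -1.2]   # hypothetical net_6, net_7
t = [1.0, 0.0]      # hypothetical targets
deltas = output_deltas(net, t)

# Numerical check: perturb each net_k and measure the change in E.
eps = 1e-6
numeric = []
for k in range(len(net)):
    plus, minus = list(net), list(net)
    plus[k] += eps
    minus[k] -= eps
    e_plus = squared_error([sigmoid(n) for n in plus], t)
    e_minus = squared_error([sigmoid(n) for n in minus], t)
    numeric.append((e_plus - e_minus) / (2 * eps))
print(deltas, numeric)
```

With a different loss or activation, only the two derivative lines inside `output_deltas` would change.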

A hidden node $j$ influences $E$ through every output node it feeds, so its error term sums over the output layer:

$\begin{aligned} \delta_j =\frac{dE}{dnet_j} & = \sum_{k\in\text{output layer}} \frac{dE}{dnet_k}\frac{dnet_k}{dnet_j} \\ & = \sum_{k\in\text{output layer}} \delta_k\frac{dnet_k}{dnet_j} \end{aligned}$

$\begin{aligned} \frac{dnet_k}{dnet_j} & = \frac{dnet_k}{dy_{j}}\frac{dy_{j}}{dnet_j} \\ & = \frac{d\sum_{j'} w_{kj'}y_{j'}}{dy_{j}}\frac{dy_{j}}{dnet_j} \\ & = w_{kj}\frac{dy_{j}}{dnet_j} \\ & = w_{kj}\frac{df(net_j)}{dnet_j} \end{aligned}$

$\delta_j = \sum_{k\in\text{output layer}} \delta_k w_{kj}\frac{df(net_j)}{dnet_j}$
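The hidden-layer formula can be sketched as follows. The code again assumes a sigmoid activation, so that $\frac{df(net_j)}{dnet_j} = y_j(1 - y_j)$, and assumes that each row of the output weight matrix starts with its bias. The numbers are hypothetical and only illustrate the arithmetic.

```python
def hidden_deltas(W_output, delta_out, y_hidden):
    """delta_j = sum_k delta_k * w_kj * f'(net_j), with f'(net_j) = y_j * (1 - y_j)."""
    deltas = []
    for j, y_j in enumerate(y_hidden):
        # Column j + 1 of W_output holds w_kj, because column 0 holds the bias.
        back = sum(d_k * row[j + 1] for d_k, row in zip(delta_out, W_output))
        deltas.append(back * y_j * (1.0 - y_j))
    return deltas

W_output = [
    [0.3, -0.4, 0.6],   # w_b6, w_64, w_65
    [-0.1, 0.7, 0.2],   # w_b7, w_74, w_75
]
delta_out = [0.05, -0.02]   # hypothetical delta_6, delta_7
y_hidden = [0.5, 0.8]       # hypothetical y_4, y_5

delta_hid = hidden_deltas(W_output, delta_out, y_hidden)
print(delta_hid)
```

The bias column is skipped because the constant input 1 does not depend on any hidden node, so no error flows back through it.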

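# Deriving the weight and bias updates

Substituting the error terms into the gradient-descent rule gives the updates for the output-layer and hidden-layer weights:

$w_{kj} =w_{kj}-\eta\delta_k y_j$

$w_{ji} =w_{ji}-\eta\delta_j x_i$

A bias is a weight whose input is the constant $1$, so the same rules give $w_{b_k} = w_{b_k}-\eta\delta_k$ and $w_{b_j} = w_{b_j}-\eta\delta_j$.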
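Putting the pieces together, the following is a minimal sketch of one training step for the 3-2-2 network, not a definitive implementation. It assumes a sigmoid activation and a squared-error loss $E = \frac{1}{2}\sum_k (y_k - t_k)^2$, neither of which is specified in the text. The weights, input, targets and learning rate are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function, an assumed choice for the activation f."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(W, inputs):
    """Return f(W . [1, inputs...]); each row of W starts with the bias weight."""
    x = [1.0] + list(inputs)
    return [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W]

def loss(W_hidden, W_output, x, t):
    """Assumed loss: E = 1/2 * sum_k (y_k - t_k)^2."""
    y = forward(W_output, forward(W_hidden, x))
    return 0.5 * sum((yk - tk) ** 2 for yk, tk in zip(y, t))

def train_step(W_hidden, W_output, x, t, eta):
    """One gradient-descent step; returns the new (W_hidden, W_output)."""
    y_hidden = forward(W_hidden, x)
    y_output = forward(W_output, y_hidden)

    # Output-layer error terms: dE/dy_k * f'(net_k).
    delta_out = [(yk - tk) * yk * (1.0 - yk) for yk, tk in zip(y_output, t)]
    # Hidden-layer error terms, using the weights from before the update;
    # column j + 1 skips the bias column.
    delta_hid = [
        sum(dk * row[j + 1] for dk, row in zip(delta_out, W_output)) * yj * (1.0 - yj)
        for j, yj in enumerate(y_hidden)
    ]

    # w = w - eta * delta * input, where the bias sees a constant input of 1.
    in_out = [1.0] + y_hidden
    in_hid = [1.0] + list(x)
    new_W_output = [[w - eta * dk * v for w, v in zip(row, in_out)]
                    for row, dk in zip(W_output, delta_out)]
    new_W_hidden = [[w - eta * dj * v for w, v in zip(row, in_hid)]
                    for row, dj in zip(W_hidden, delta_hid)]
    return new_W_hidden, new_W_output

W_hidden = [[0.1, 0.2, -0.3, 0.4], [-0.2, 0.5, 0.1, -0.1]]
W_output = [[0.3, -0.4, 0.6], [-0.1, 0.7, 0.2]]
x, t = [1.0, 2.0, 3.0], [1.0, 0.0]

before = loss(W_hidden, W_output, x, t)
for _ in range(100):
    W_hidden, W_output = train_step(W_hidden, W_output, x, t, eta=0.5)
after = loss(W_hidden, W_output, x, t)
print(before, after)
```

Both layers' error terms are computed before any weight is changed, so the hidden-layer terms use the same output weights that produced the forward pass.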

# References

Original author: Wonder-YYC
Original URL: https://www.cnblogs.com/chaogex/p/16350439.html