Neural Network Notes

Neural networks are a beautiful, biologically inspired programming paradigm that enables a computer to learn from observational data.

Perceptron

A perceptron takes several binary inputs, $x_1, x_2, \ldots$, and produces a single binary output.
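
The figure that originally followed showed the decision rule; algebraically, with weights $w_1, w_2, \ldots$ expressing the importance of each input, the standard perceptron rule is:

\[\text{output}=\begin{cases}0&\text{if }\sum_jw_jx_j\le\text{threshold}\\1&\text{if }\sum_jw_jx_j>\text{threshold}\end{cases}\]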

SGD (Stochastic Gradient Descent)

To quantify how well we’re achieving this goal we define a cost function: \(C(w,b)=\frac1{2n}\sum_x{\left\|y(x)-a\right\|^2}\) Here, w denotes the collection of all weights in the network, b all the biases, n is the total number of training inputs, a is the vector of outputs from the network when x is input, and the sum is over all training inputs, x. Of course, the output a depends on x, w and b, but to keep the notation simple I haven’t explicitly indicated this dependence. The notation ‖v‖ just denotes the usual length function for a vector v. We’ll call C the quadratic cost function; it’s also sometimes known as the mean squared error or just MSE.
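
As a concrete illustration, here is a minimal NumPy sketch of this cost together with one SGD update step; the function names and shapes (one training example per row of y and a) are my own choices, not from the original notes.

```python
import numpy as np

def quadratic_cost(y, a):
    # C(w, b) = (1/2n) * sum_x ||y(x) - a||^2, one example per row.
    n = y.shape[0]
    return np.sum(np.linalg.norm(y - a, axis=1) ** 2) / (2 * n)

def sgd_step(params, grads, eta):
    # One stochastic-gradient-descent update, v -> v - eta * dC/dv,
    # where the gradients are estimated from a small mini-batch rather
    # than from all n training inputs at once.
    return [p - eta * g for p, g in zip(params, grads)]
```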

Backpropagation

\[\delta^L_j=\frac{\partial C}{\partial a^L_j}\sigma^\prime(z^L_j)\]
\[\delta^l=((w^{l+1})^T\delta^{l+1})\odot\sigma^\prime(z^l)\]
\[\frac{\partial C}{\partial b^l_j}=\delta^l_j\]
\[\frac{\partial C}{\partial w^l_{jk}}=a^{l-1}_k\delta^l_j\]
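
A minimal NumPy sketch of these four equations for a fully-connected sigmoid network; the shapes and names are assumptions on my part (x and y are column vectors, weights and biases are per-layer lists, and the quadratic cost is assumed in BP1):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def backprop(x, y, weights, biases):
    # Forward pass: cache each layer's weighted input z^l and activation a^l.
    a, activations, zs = x, [x], []
    for w, b in zip(weights, biases):
        z = w @ a + b
        zs.append(z)
        a = sigmoid(z)
        activations.append(a)
    # BP1 (output layer): delta^L = dC/da^L * sigma'(z^L); for the
    # quadratic cost, dC/da^L is simply (a^L - y).
    delta = (activations[-1] - y) * sigmoid_prime(zs[-1])
    grads_w, grads_b = [None] * len(weights), [None] * len(biases)
    grads_b[-1] = delta                          # BP3
    grads_w[-1] = delta @ activations[-2].T      # BP4
    # BP2: propagate the error backwards through the hidden layers.
    for l in range(2, len(weights) + 1):
        delta = (weights[-l + 1].T @ delta) * sigmoid_prime(zs[-l])
        grads_b[-l] = delta
        grads_w[-l] = delta @ activations[-l - 1].T
    return grads_w, grads_b
```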

Initialization

We initialize the weights as Gaussian random variables with mean $0$ and standard deviation $\frac1{\sqrt{n_{in}}}$, where $n_{in}$ is the number of weights feeding into the neuron. This keeps the weighted input $z$ from growing with the fan-in, so the neuron is much less likely to start out saturated.
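
A short NumPy sketch (the function name and the use of default_rng are my own):

```python
import numpy as np

rng = np.random.default_rng(0)

def init_weights(n_in, n_out):
    # Gaussian with mean 0 and std 1/sqrt(n_in): the weighted input
    # z = w.x + b then stays O(1) regardless of fan-in, so sigmoid
    # neurons start out unsaturated. Biases can remain N(0, 1).
    return rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_out, n_in))
```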

Cross-entropy Cost Function

\[C=-\frac1n\sum_x[y\ln a+(1-y)\ln(1-a)]\]

where n is the total number of items of training data, the sum is over all training inputs, x, and y is the corresponding desired output. Unlike the quadratic cost, the cross-entropy’s weight gradient carries no $\sigma^\prime(z)$ factor, so learning does not slow down when the output neuron saturates.
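
A direct NumPy translation (the clipping epsilon is my addition, to guard against $\ln 0$):

```python
import numpy as np

def cross_entropy_cost(y, a, eps=1e-12):
    # C = -(1/n) * sum_x [ y ln a + (1 - y) ln(1 - a) ]
    n = y.shape[0]
    a = np.clip(a, eps, 1.0 - eps)  # keep the logs finite
    return -np.sum(y * np.log(a) + (1.0 - y) * np.log(1.0 - a)) / n
```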

Softmax

\[a^L_j=\frac{e^{z^L_j}}{\sum_ke^{z^L_k}}\]
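
The outputs are positive and sum to 1, so a softmax layer can be read as a probability distribution. A NumPy sketch (the max subtraction is a standard stability trick, not part of the formula itself):

```python
import numpy as np

def softmax(z):
    # Subtracting max(z) leaves the result unchanged (softmax is
    # invariant to adding a constant to every z_k) but avoids overflow.
    e = np.exp(z - np.max(z))
    return e / e.sum()
```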

Regularization

L2 regularization (weight decay) adds the following penalty term to the cost function, where $\lambda>0$ is the regularization parameter and the sum runs over all weights in the network:

\[\frac{\lambda}{2n}\sum_ww^2\]
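
Adding this term changes the gradient-descent update for each weight as below; the $\left(1-\frac{\eta\lambda}{n}\right)$ rescaling factor is why L2 regularization is also called weight decay ($C_0$ denotes the unregularized cost):

\[w\rightarrow\left(1-\frac{\eta\lambda}{n}\right)w-\eta\frac{\partial C_0}{\partial w}\]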

Dropout

Dropout is a radically different regularization technique: for each training mini-batch we temporarily delete a random half of the hidden neurons, train on the modified network, then restore them. Because the network can never rely on any single neuron, the procedure behaves like averaging an ensemble of smaller networks.
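
A sketch of the "inverted dropout" variant commonly used in practice, which rescales at training time instead of halving the outgoing weights at test time as in the original description:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(a, p=0.5, training=True):
    # Zero each activation with probability p and rescale the survivors,
    # so the expected activation is unchanged and no adjustment is
    # needed at test time.
    if not training:
        return a
    mask = rng.random(a.shape) >= p
    return a * mask / (1.0 - p)
```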

Hyperparameters

learning rate $\eta$, regularization parameter $\lambda$, mini-batch size

Activation Functions

sigmoid:

\[\sigma(z)=\frac1{1+e^{-z}}\]

tanh:

\[\tanh(z)=\frac{e^z-e^{-z}}{e^z+e^{-z}}\]

ReLU (rectified linear unit):

\[\max(0,z)\]
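
All three in NumPy, together with their derivatives; the bounds in the comments matter for the next section:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    return sigmoid(z) * (1.0 - sigmoid(z))  # peaks at 0.25

def tanh_prime(z):
    return 1.0 - np.tanh(z) ** 2            # peaks at 1

def relu(z):
    return np.maximum(0.0, z)

def relu_prime(z):
    return (z > 0).astype(float)            # exactly 0 or 1
```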

Vanishing Gradient

Gradients shrink as they are propagated backwards through the network, so neurons in the early (front) layers learn much more slowly than those in later layers: each layer contributes a factor of roughly $w\,\sigma^\prime(z)$ to the gradient, and since these factors are typically smaller than $1$ in magnitude, their product vanishes as the depth grows.
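
A toy calculation of why: for a sigmoid network $\sigma^\prime(z)\le 0.25$, so even with $|w|=1$ the gradient shrinks at least geometrically with depth.

```python
# Best case per layer for a sigmoid net with |w| = 1: sigma'(0) = 0.25.
factor = 0.25
for depth in (1, 5, 10):
    print(depth, factor ** depth)  # 0.25, ~9.8e-4, ~9.5e-7
```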

CNN

Convolutional Neural Network

RNN

Recurrent Neural Network

LSTM

Long Short-Term Memory

DBN

Deep Belief Network

GAN

Generative Adversarial Network

Reference

Michael Nielsen, Neural Networks and Deep Learning, Determination Press, 2015
