Intro to Neural Networks

\[ \sigma(z) = \frac{1}{1 + e^{-z}} \]

inputs → hidden layer → outputs

Consider a classification task: identify fruit from three features — \(x_1\) = color, \(x_2\) = weight, \(x_3\) = sugar content. We design a network with:

3 input nodes
1 hidden layer with 2 neurons
2 output nodes for two possible fruit types

We use the sigmoid function \(\sigma(z)\) as our activation throughout.
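As a minimal sketch, the sigmoid can be written in a few lines of Python. The function name `sigmoid` is an illustrative choice, not something the text defines, and this naive form assumes inputs of moderate size: `math.exp` overflows for very negative \(z\) (roughly below \(-700\)).

```python
import math

def sigmoid(z: float) -> float:
    """Logistic sigmoid: squashes any real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Large negative inputs map near 0, zero maps to exactly 0.5,
# and large positive inputs map near 1.
for z in (-5.0, 0.0, 5.0):
    print(z, sigmoid(z))
```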

\[ h_1 = \sigma(w_{11}x_1 + w_{12}x_2 + w_{13}x_3 + b_1) \]
\[ h_2 = \sigma(w_{21}x_1 + w_{22}x_2 + w_{23}x_3 + b_2) \]
\[ o_1 = \sigma(w_{31}h_1 + w_{32}h_2 + b_3) \]

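Each neuron computes a weighted sum of its inputs plus a bias and passes the result through the activation function. The hidden layer produces the values \(h_1, h_2\), which are then combined to produce the outputs \(o_1, o_2\). The second output \(o_2\) is computed the same way as \(o_1\), with its own weights and bias.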

Try it out with all weights and biases set to 1 and inputs \([1, 0, 1]\). A full forward pass then gives the hidden and output activations: each hidden neuron receives the weighted sum \(1 + 0 + 1 + 1 = 3\), so \(h_1 = h_2 = \sigma(3) \approx 0.953\).
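The forward pass for this setup can be sketched in plain Python. The helper `neuron` and the variable names are illustrative assumptions, and the weights of the second output neuron are assumed to be 1 like all the others.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(weights, inputs, bias):
    """Weighted sum of inputs plus bias, passed through the sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

x = [1.0, 0.0, 1.0]  # inputs from the text

# All weights and biases set to 1, as in the text.
hidden_weights = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]  # rows: h_1, h_2
hidden_biases = [1.0, 1.0]
output_weights = [[1.0, 1.0], [1.0, 1.0]]            # rows: o_1, o_2
output_biases = [1.0, 1.0]

# Hidden layer first, then the output layer fed by the hidden activations.
h = [neuron(w, x, b) for w, b in zip(hidden_weights, hidden_biases)]
o = [neuron(w, h, b) for w, b in zip(output_weights, output_biases)]

print("hidden:", [round(v, 4) for v in h])
print("output:", [round(v, 4) for v in o])
```

Because every weight and bias is identical, the two hidden neurons agree with each other, and so do the two outputs, which come out at roughly 0.948.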

\[ L = \frac{1}{2}\left[(o_1 - 1)^2 + (o_2 - 0)^2\right] \]
\[ \frac{\partial L}{\partial w_{31}} = (o_1 - y_1) \cdot \sigma'(z_{o_1}) \cdot h_1 \]

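Suppose our target output is \([1, 0]\), so \(y_1 = 1\) and \(y_2 = 0\). We define a loss function \(L\) and update the weights via gradient descent. The derivative of the loss with respect to each weight follows the chain rule back through the network. For \(w_{31}\), the three factors are the output error \((o_1 - y_1)\), the slope of the activation at \(z_{o_1} = w_{31}h_1 + w_{32}h_2 + b_3\) (the weighted sum feeding \(o_1\)), and the input \(h_1\) that the weight multiplies. The slope is \(\sigma'(z) = \sigma(z)(1 - \sigma(z))\).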
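To make the chain-rule expression concrete, the sketch below evaluates the gradient for \(w_{31}\) with all weights and biases at 1 and inputs \([1, 0, 1]\), then compares it with a finite-difference estimate. The learning rate of 0.5 and the names `loss`, `grad`, and `numeric` are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Setup: inputs [1, 0, 1], all weights and biases 1, target [1, 0].
h1 = h2 = sigmoid(1.0 * 1 + 1.0 * 0 + 1.0 * 1 + 1.0)  # both hidden units see z = 3
y1, y2 = 1.0, 0.0

def loss(w31, w32=1.0, b3=1.0):
    """Loss as a function of w31, holding every other parameter at 1."""
    o1 = sigmoid(w31 * h1 + w32 * h2 + b3)
    o2 = sigmoid(1.0 * h1 + 1.0 * h2 + 1.0)  # o_2 does not depend on w31
    return 0.5 * ((o1 - y1) ** 2 + (o2 - y2) ** 2)

# Analytic gradient: (o_1 - y_1) * sigma'(z_o1) * h_1
w31 = 1.0
z_o1 = w31 * h1 + 1.0 * h2 + 1.0
o1 = sigmoid(z_o1)
grad = (o1 - y1) * o1 * (1.0 - o1) * h1

# Central finite difference as an independent check on the formula.
eps = 1e-6
numeric = (loss(w31 + eps) - loss(w31 - eps)) / (2 * eps)

# One gradient-descent step (learning rate is an arbitrary illustrative value).
learning_rate = 0.5
w31_new = w31 - learning_rate * grad

print("analytic:", grad, "numeric:", numeric)
print("loss before:", loss(w31), "after:", loss(w31_new))
```

The gradient is negative here because \(o_1\) is below its target of 1, so the update increases \(w_{31}\) and the loss drops slightly.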

\[ \mathbf{h} = \sigma(\mathbf{W}^{(1)}\mathbf{x} + \mathbf{b}^{(1)}) \]
\[ \mathbf{o} = \sigma(\mathbf{W}^{(2)}\mathbf{h} + \mathbf{b}^{(2)}) \]

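Instead of writing each term by hand, we group the computations into matrices. For this network, \(\mathbf{W}^{(1)}\) is the \(2 \times 3\) matrix of hidden-layer weights and \(\mathbf{W}^{(2)}\) is the \(2 \times 2\) matrix of output-layer weights, with \(\sigma\) applied elementwise. This notation generalizes immediately to any number of layers and neurons, and training uses matrix calculus to update every entry of \(\mathbf{W}\) and \(\mathbf{b}\) at once.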
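The matrix form translates into a short loop over layers. The sketch below uses plain Python lists so it runs without dependencies; the helpers `affine`, `sigmoid_vec`, and `forward` are hypothetical names, and a numerical library such as NumPy would normally replace `affine` with a single matrix product.

```python
import math

def sigmoid_vec(v):
    """Apply the sigmoid elementwise to a vector."""
    return [1.0 / (1.0 + math.exp(-z)) for z in v]

def affine(W, x, b):
    """Compute W x + b for a matrix W (list of rows), vector x, and bias b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def forward(layers, x):
    """Apply each (W, b) pair in turn: a = sigma(W a + b)."""
    a = x
    for W, b in layers:
        a = sigmoid_vec(affine(W, a, b))
    return a

W1 = [[1.0] * 3 for _ in range(2)]  # 2x3: hidden layer
b1 = [1.0, 1.0]
W2 = [[1.0] * 2 for _ in range(2)]  # 2x2: output layer
b2 = [1.0, 1.0]

o = forward([(W1, b1), (W2, b2)], [1.0, 0.0, 1.0])
print(o)
```

Adding a layer only means appending another `(W, b)` pair to the list, which is the sense in which the notation generalizes.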

Summary

Neural networks are sequences of matrix multiplications and non-linear activations. Training is the process of reducing error via gradient descent and backpropagation. Though the name sounds complicated, a neural network is a layered function composed of simple pieces, and building one by hand is the clearest way to see that.