Introducing Learning Algorithms
- We use learning algorithms to tune the weights and biases of a network
- In our case, we could use a learning algorithm to solve for the weights and biases in a perceptron
- To see how learning might work, suppose we make a small change in some weight (or bias) in the network
- What we'd like is for this small change in weight to cause only a small corresponding change in the output from the network
- This property is what makes learning possible
Motivating Sigmoid Neurons
- If it were true that a small change in a weight (or bias) causes only a small change in output, then we could use this fact to modify the weights and biases to get our network to behave more in the manner we want
- However, perceptrons don't always behave this way
- Specifically, small changes in a weight or bias can cause large changes in output
- In fact, a small change in the weights or bias of any single perceptron in the network can sometimes cause the output of that perceptron to completely flip (say from $0$ to $1$), as the sketch after this list demonstrates
- We can overcome this problem by introducing a new type of artificial neuron called a sigmoid neuron
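To make the flipping behavior concrete, here is a minimal sketch (the `perceptron` helper and all values are illustrative, not from the original text): a weighted input sitting just below the threshold flips the output entirely after a tiny weight change.

```python
import numpy as np

def perceptron(w, x, b):
    """Perceptron output: 1 if the weighted input w . x + b is positive, else 0."""
    return 1 if np.dot(w, x) + b > 0 else 0

x = np.array([1.0, 1.0])
b = -2.0

# The weighted input is barely below the threshold, so the perceptron outputs 0
print(perceptron(np.array([1.0, 0.999]), x, b))  # 0

# A change of just 0.002 in one weight flips the output completely, from 0 to 1
print(perceptron(np.array([1.0, 1.001]), x, b))  # 1
```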
Defining Sigmoid Neurons
- Sigmoid neurons are similar to perceptrons, but modified so that small changes in their weights and bias cause only a small change in their output
- A sigmoid neuron takes in several inputs
- A sigmoid neuron returns a single output between $0$ and $1$
- For example, our output could be a value like $0.638$
- A sigmoid neuron uses a different activation function:
- Sometimes, we specifically refer to the activation function of a sigmoid neuron as the sigmoid function, which can be represented as:
- $\sigma(z) \equiv \frac{1}{1 + e^{-z}}$
- Specifically, the output of a sigmoid neuron is the following (sketched in code after this list):
- $\frac{1}{1 + \exp\left(-\sum_j w_j x_j - b\right)}$
- Where $w_j$ are our weights
- Where $b$ is our bias
- Where $x_j$ are our inputs
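As a minimal sketch of the formula above (the weights, inputs, and bias are arbitrary example values):

```python
import numpy as np

def sigmoid(z):
    """The sigmoid function: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_neuron_output(w, x, b):
    """Output of a sigmoid neuron: sigmoid(sum_j w_j x_j + b)."""
    return sigmoid(np.dot(w, x) + b)

w = np.array([2.0, -1.0])  # weights w_j
x = np.array([1.0, 3.0])   # inputs x_j
b = 0.5                    # bias b

print(sigmoid_neuron_output(w, x, b))  # ~0.378, strictly between 0 and 1
```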
Illustrating Sigmoid Neurons
- We can visually represent a sigmoid neuron as the following:
- Where $w_j$ is a weight
- Where $x_j$ is an input
- Where $\sum$ is the weighted sum function
- Where the output of this function is $z = \sum_j w_j x_j + b$
- Where $\sigma$ is the sigmoid function
- Where the output of this function is $\sigma(z)$
- Where $y = \sigma(z)$ is the output
- Then, we can further simplify some of the notation (a worked example follows this list):
- Where $w_j$ is a weight
- Where $x_j$ is an input
- Where the output $z$ of the weighted sum function is the input of the sigmoid function
- Where $y = \sigma(z)$ is the output
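As a quick worked example of this notation (values chosen arbitrarily): with weights $w = (1, 1)$, inputs $x = (0.6, 0.4)$, and bias $b = -1$, the weighted sum function gives $z = 1 \cdot 0.6 + 1 \cdot 0.4 - 1 = 0$, and the sigmoid function gives the output $y = \sigma(0) = \frac{1}{1 + e^{0}} = 0.5$.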
The Behavior of Sigmoid Neurons
- The activation function of a perceptron is a step function
- The sigmoid neuron behaves in a similar way to the perceptron
- If $z \equiv w \cdot x + b$ is a large positive number, then $e^{-z} \approx 0$ and so $\sigma(z) \approx 1$
- In other words, when $z$ is large and positive, the output from the sigmoid neuron is approximately $1$, just as it would be for a perceptron
- On the other hand, if $z$ is very negative, then $e^{-z} \to \infty$ and $\sigma(z) \approx 0$
- In other words, when $z$ is very negative, the output from the sigmoid neuron is approximately $0$
- This behavior of a sigmoid neuron closely approximates a perceptron
- It's only when $z$ is of modest size that there's much deviation from the perceptron model
- This behavior is reflected in the shape of both functions: the sigmoid is a smoothed-out version of the step function (compared numerically in the sketch below)
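A minimal numerical sketch of this comparison, where the step function plays the role of the perceptron's activation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def step(z):
    """Perceptron-style step activation."""
    return 1.0 if z > 0 else 0.0

# For large |z| the sigmoid matches the step function almost exactly;
# only for z of modest size is there much deviation
for z in [-10.0, -2.0, -0.5, 0.0, 0.5, 2.0, 10.0]:
    print(f"z = {z:6.1f}  sigmoid = {sigmoid(z):.5f}  step = {step(z)}")
```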
Use-Cases of Sigmoid Neurons
- We can use a sigmoid neuron to represent the average intensity of the pixels of an image
- We can use a sigmoid neuron to represent probabilistic output
- We can also use a sigmoid neuron (with a given threshold) to represent binary output
- In other words, we can set a threshold of $0.5$ to reflect the exact behavior of a perceptron, as the sketch after this list shows
- Compared to a perceptron, this gives us more flexibility to choose whatever threshold we see fit
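Here is a minimal sketch of such thresholding (the `binarize` helper and the threshold values are illustrative):

```python
def binarize(output, threshold=0.5):
    """Convert a sigmoid neuron's output into a perceptron-like binary decision."""
    return 1 if output >= threshold else 0

print(binarize(0.73))                 # 1: fires under the conventional 0.5 threshold
print(binarize(0.73, threshold=0.9))  # 0: a stricter threshold we are free to choose
```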
tldr
- A neural network without an activation function is just a linear function (demonstrated in the sketch after this list)
- Different activation functions are used depending on whether we want to:
    - Return outputs in a certain range
    - Use one that is monotonic
    - Use one that has a monotonic derivative
    - Use one that approximates the identity near the origin
- A sigmoid neuron is similar to a perceptron, but modified so that small changes in its weights and bias cause only a small change in its output
- A sigmoid neuron returns output in the range of $(0, 1)$
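As a minimal sketch of the first point (layer sizes and values are arbitrary): stacking two layers with no activation function between them collapses into a single linear (affine) function.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)

# Two "layers" with no activation function between them...
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
two_layers = W2 @ (W1 @ x + b1) + b2

# ...are equivalent to one linear function W x + b
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b

print(np.allclose(two_layers, one_layer))  # True
```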