My 3 months with Computer Vision — Part 2 — Understanding Neural Network and Layers

Neural Network

What are Neural Networks?

A Neural Network, in very simple terms, is a bunch of nodes (neurons) that transform one set of values into another. These nodes are arranged together in hidden layers. The values flow from the input to the output, and the error measured at the output is then backpropagated through the layers towards the input so that the weights can be adjusted. A layer in which every node is connected to every node of the previous layer is known as a Dense layer, and it is the foundation of Neural Networks. So we already have 2 important parameters to understand for training a neural network:

  1. Error
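To make the flow of values and the error concrete, here is a minimal sketch of a forward pass through two dense layers in plain Python. The weights, biases, the ReLU activation, and the choice of mean squared error are all assumptions picked for illustration; they do not come from a trained model.

```python
def dense_forward(inputs, weights, biases):
    """One dense layer: every output node sees every input value."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(max(0.0, total))  # ReLU activation (assumed)
    return outputs


def mean_squared_error(predicted, actual):
    """Average squared difference between prediction and target."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical numbers, chosen only to keep the arithmetic easy to follow.
inputs = [1.0, 2.0]
hidden = dense_forward(inputs, [[0.5, -1.0], [1.0, 1.0]], [0.0, -1.0])
prediction = dense_forward(hidden, [[1.0, 0.5]], [0.0])
error = mean_squared_error(prediction, [2.0])

print(hidden, prediction, error)
```

The `error` value at the end is what would be backpropagated to adjust the weights; the sketch stops before that step.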

What is Deep Learning?

Deep Learning, in this context, is just adding some layers before the dense layers that learn to detect features during training. Before these layers were introduced, software engineers had to hand-program the features that were fed into the neural network. Now these features can be detected automatically using the following layers.

Understanding Convolution

Convolution means sliding small filters over the input to produce a new array of features. The input starts large and multidimensional and is reduced to a smaller output: you start by adding convolutional layers, and the size of the data to be processed typically shrinks with every layer.
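The shrinking can be worked out with simple arithmetic. The sketch below assumes a 28x28 input, 3x3 filters with no padding, and 2x2 pooling with stride 2; these sizes are illustrative assumptions, not values from this article.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output width (or height) after sliding a window over the input."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


size = 28  # assumed input width and height
sizes = [size]
for _ in range(2):
    size = conv_output_size(size, kernel_size=3)            # 3x3 convolution
    size = conv_output_size(size, kernel_size=2, stride=2)  # 2x2 pooling
    sizes.append(size)

print(sizes)  # [28, 13, 5]
```

Each convolution plus pooling stage roughly halves the width and height, which is why the later layers have far less data to process.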

Computer Vision Layers

1.1 Conv2D

Conv2D is a set of filters that you pass over the image (or the image array); the filtered output becomes the input to the next layer. Each filter is slid over the array, and at every position its values are multiplied element-wise with the values beneath it and summed into a single output value. You can read about the parameters you can pass in the Keras documentation.

Convolution Visualized
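The sliding and multiplying can be sketched in a few lines of plain Python. This is a simplified single-filter, single-channel version with stride 1 and no padding. The image and the edge-detecting filter are made-up values, and the function `conv2d` is a hypothetical helper, not the Keras layer itself. As in deep learning libraries, the filter is not flipped, so strictly speaking this is cross-correlation.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image; multiply and sum at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# A 4x4 image that is dark on the left and bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A filter that responds where brightness increases from left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The filter only produces a non-zero value where the dark and bright columns meet, which is the kind of feature a trained Conv2D layer learns to pick up on its own.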

1.2 MaxPooling2D

MaxPooling2D is a pooling layer and is much simpler to explain. Given a window height and width, all it does is find the maximum value inside that window. The window is slid over the layer beneath, just like before.

MaxPooling2D Visualized
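Here is a minimal sketch of the same idea in plain Python, assuming a 2x2 window moved with a stride of 2. The input numbers are arbitrary and `max_pool2d` is a hypothetical helper written for this example.

```python
def max_pool2d(feature_map, pool_size=2, stride=2):
    """Keep only the largest value inside each window."""
    pooled = []
    for i in range(0, len(feature_map) - pool_size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - pool_size + 1, stride):
            window = [
                feature_map[i + di][j + dj]
                for di in range(pool_size)
                for dj in range(pool_size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 2], [2, 9]]
```

The 4x4 input becomes 2x2, so the next layer has a quarter of the values to look at while the strongest responses are kept.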

1.3 Normalizing & Regularization Layers

We need these layers to keep the model robust to change. The model might go off on a wrong tangent and decide that a few parameters are really important. Dropout drops some nodes at random during training. This makes sure that the model spreads what it learns across all nodes instead of relying heavily on some nodes and ignoring others. Normalization does similar work by normalizing the inputs coming from the previous layer.
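Both ideas fit in a short sketch. It assumes a dropout rate of 0.5 and made-up activation values. The `normalize` helper is a simplification: it normalizes a single list of values, whereas a batch normalization layer normalizes each feature across the batch and also learns a scale and a shift.

```python
import random


def dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale survivors to compensate."""
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


def normalize(values, epsilon=1e-5):
    """Shift values to mean 0 and scale them to roughly unit variance."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / (variance + epsilon) ** 0.5 for v in values]


rng = random.Random(0)  # fixed seed so runs are repeatable
activations = [2.0, 4.0, 6.0, 8.0]  # hypothetical outputs of a previous layer

dropped = dropout(activations, rate=0.5, rng=rng)
normalized = normalize(activations)

print(dropped)
print(normalized)
```

Dropout is only applied while training; at prediction time all nodes are used.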

1.4 Flatten

Flatten just turns the multidimensional array into a single-dimensional one, so that it can be fed to the dense layers.
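As a small sketch, flattening a stack of two 2x2 feature maps (made-up values) gives one list of eight numbers. The `flatten` helper below is a hypothetical stand-in for what the layer does.

```python
def flatten(array):
    """Recursively unroll nested lists into one flat list."""
    if not isinstance(array, list):
        return [array]
    flat = []
    for item in array:
        flat.extend(flatten(item))
    return flat


feature_maps = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]

flat = flatten(feature_maps)
print(flat)  # [1, 2, 3, 4, 5, 6, 7, 8]
```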

Augment the data

Augmentation is necessary to make sure that the model is generalizing and not just memorizing the data. The different kinds of augmentation include zooming the image, shifting it along the x axis, shifting it along the y axis, etc. You can read about more augmentations here. In part 6, I describe augmentation in more detail as we create our own.
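To give a feel for the x and y variations, here is a rough sketch that shifts a tiny image by a random number of pixels and fills the gap with zeros. The helper names, the 3x3 image, and the fill value are assumptions made for this example; this is not the augmentation code from part 6. The shift is assumed to be smaller than the image.

```python
import random


def shift_x(image, pixels, fill=0):
    """Shift right by `pixels` (left if negative), padding with `fill`."""
    width = len(image[0])
    if pixels >= 0:
        return [[fill] * pixels + row[:width - pixels] for row in image]
    return [row[-pixels:] + [fill] * (-pixels) for row in image]


def shift_y(image, pixels, fill=0):
    """Shift down by `pixels` (up if negative), padding with `fill`."""
    height, width = len(image), len(image[0])
    blank = [[fill] * width for _ in range(abs(pixels))]
    if pixels >= 0:
        return blank + [row[:] for row in image[:height - pixels]]
    return [row[:] for row in image[-pixels:]] + blank


def random_shift(image, max_shift, rng):
    """Apply a random x and y shift of at most `max_shift` pixels."""
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    return shift_y(shift_x(image, dx), dy)


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

print(shift_x(image, 1))   # [[0, 1, 2], [0, 4, 5], [0, 7, 8]]
print(shift_y(image, -1))  # [[4, 5, 6], [7, 8, 9], [0, 0, 0]]
augmented = random_shift(image, max_shift=1, rng=random.Random(0))
```

Because each epoch sees a slightly different version of every image, the model cannot simply memorize exact pixel positions.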

Compiling the model and defining the loss and optimizer

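These are parameters to the model. The loss defines the difference between the actual value and the prediction, which is what needs to be backpropagated; this is the error we discussed at the start of the article. The optimizer decides how the weights are updated using that error.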
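The roles of the loss and the optimizer can be seen in a toy sketch with a single weight. It assumes mean squared error as the loss and plain gradient descent as the optimizer, with made-up data where the right answer is a weight of 2. Real models use the same loop with many more weights and usually a more elaborate optimizer.

```python
def mse_loss(predictions, targets):
    """Loss: average squared difference between prediction and target."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def gradient_step(weight, inputs, targets, learning_rate):
    """Optimizer: move the weight against the gradient of the loss."""
    n = len(inputs)
    gradient = sum(2 * (weight * x - t) * x for x, t in zip(inputs, targets)) / n
    return weight - learning_rate * gradient


inputs = [1.0, 2.0, 3.0]
targets = [2.0, 4.0, 6.0]  # target = 2 * input
weight = 0.0               # the model is: prediction = weight * input
losses = []

for _ in range(20):
    predictions = [weight * x for x in inputs]
    losses.append(mse_loss(predictions, targets))
    weight = gradient_step(weight, inputs, targets, learning_rate=0.05)

print(round(weight, 3), losses[0], losses[-1])
```

The loss shrinks with every step and the weight settles close to 2, which is the whole training process in miniature.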

Training the model

Here you can change the batch size. The batch size is the number of images over which the loss is accumulated before backpropagating. Together with the steps above (shuffling the images, augmenting the images, and adding Dropout and Batch Normalization to the model), this last step also helps the model generalize.
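Shuffling and batching can be sketched as follows. The samples are just the numbers 0 to 9 standing in for images, and the batch size of 4 is an arbitrary assumption. `iterate_batches` is a hypothetical helper, not part of any library.

```python
import random


def iterate_batches(samples, batch_size, rng):
    """Shuffle the sample order, then yield one batch at a time."""
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [samples[i] for i in indices[start:start + batch_size]]


samples = list(range(10))  # stand-ins for 10 images
batches = list(iterate_batches(samples, batch_size=4, rng=random.Random(42)))

for batch in batches:
    # In real training, the loss is accumulated over this batch and the
    # weights are updated once per batch.
    print(batch)
```

With 10 samples and a batch size of 4 there are three weight updates per epoch, the last one on a smaller batch of 2.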

Senior Software Engineer at M56Studios. Interested in mobile and web development and Deep Learning.
