My 3 months with Computer Vision — Part 3— Simple Neural Network from scratch for MNIST

Angadpreet Nagpal
2 min read · Apr 18, 2021

Let’s start with Project 1 — the MNIST dataset. MNIST is a dataset of handwritten digits: 28×28 pixel grayscale images, and the task is to predict which digit, 0 through 9, each image shows.

1. Loading The Data

Let’s start by downloading the data. The data is from Kaggle: https://www.kaggle.com/c/digit-recognizer. We will also create a submission for Kaggle and see how our model performs.

You can start by cloning my repository https://github.com/angadp/DeepLearning

Let’s load the data and look at a few samples:

MNIST visualized
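The loading step might look like this. This is a sketch: with the real Kaggle file you would call `pd.read_csv("train.csv")`, which has a `label` column plus 784 pixel columns; here a tiny stand-in frame with that layout is built instead, so the reshaping logic can be checked without the download.

```python
import numpy as np
import pandas as pd

# Stand-in for Kaggle's train.csv: a "label" column plus 784 pixel columns.
# With the real file you would instead do: df = pd.read_csv("train.csv")
cols = ["label"] + [f"pixel{i}" for i in range(784)]
data = np.hstack([np.arange(3).reshape(3, 1), np.zeros((3, 784), dtype=int)])
df = pd.DataFrame(data, columns=cols)

y = df["label"].to_numpy()                                # digit labels 0-9
X = df.drop(columns="label").to_numpy().astype("float32")
X = X.reshape(-1, 28, 28, 1) / 255.0                      # scale pixels to [0, 1]

print(X.shape, y.shape)  # (3, 28, 28, 1) (3,)
```

The reshape to `(-1, 28, 28, 1)` turns each flat row of 784 pixels back into a single-channel image, which is the shape the convolutional layers below expect.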

2. Augmenting the data

We will use Keras’s ImageDataGenerator for this. You pass it your training images and it yields the same images with random transformations applied (shifts, rotations, zooms, and so on). This helps the model learn the right features instead of memorizing exact pixels: it sees the same digits, written in slightly different ways.
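A minimal sketch of that step, on random stand-in images. The specific augmentation ranges below are illustrative choices, not values taken from the article:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Random images standing in for the MNIST training set (N, 28, 28, 1).
X = np.random.rand(32, 28, 28, 1).astype("float32")
y = np.random.randint(0, 10, size=32)

# Each range here is an illustrative choice; tune them for your data.
datagen = ImageDataGenerator(
    rotation_range=10,       # rotate up to 10 degrees
    width_shift_range=0.1,   # shift horizontally up to 10% of width
    height_shift_range=0.1,  # shift vertically up to 10% of height
    zoom_range=0.1,          # zoom in/out up to 10%
)

# flow() yields augmented batches; shapes match the input images.
batch_X, batch_y = next(datagen.flow(X, y, batch_size=16))
print(batch_X.shape)  # (16, 28, 28, 1)
```

Note that horizontal flips, a common augmentation elsewhere, are a bad idea for digits: a flipped 2 or 5 is no longer the same digit.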

3. Model

Now for the model. As discussed in the previous article, we need a convolutional model whose feature maps start big and get progressively smaller.

Please go through the Keras documentation; you can experiment with more parameters of the Conv2D and MaxPooling2D layers. Just make sure the output shape keeps shrinking from layer to layer. If you visualize the model, you will see what I mean:

Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 26, 26, 256) 2560
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 256) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 11, 11, 128) 295040
_________________________________________________________________
batch_normalization (BatchNo (None, 11, 11, 128) 512
_________________________________________________________________
conv2d_2 (Conv2D) (None, 9, 9, 96) 110688
_________________________________________________________________
conv2d_3 (Conv2D) (None, 7, 7, 64) 55360
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 3, 3, 64) 0
_________________________________________________________________
dropout (Dropout) (None, 3, 3, 64) 0
_________________________________________________________________
flatten (Flatten) (None, 576) 0
_________________________________________________________________
dense (Dense) (None, 128) 73856
_________________________________________________________________
dropout_1 (Dropout) (None, 128) 0
_________________________________________________________________
dense_1 (Dense) (None, 32) 4128
_________________________________________________________________
dense_2 (Dense) (None, 10) 330
=================================================================
Total params: 542,474
Trainable params: 542,218
Non-trainable params: 256
_________________________________________________________________

As you can see, the output shape shrinks as we go from the first layer to the last. Now the rest of the work is done by training.
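A Sequential stack that reproduces the summary above might look like this. The layer sizes are read directly off the summary; the activations and dropout rates are assumptions, since the summary does not show them:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, BatchNormalization,
                                     Dropout, Flatten, Dense)

model = Sequential([
    Conv2D(256, 3, activation="relu", input_shape=(28, 28, 1)),  # -> 26x26x256
    MaxPooling2D(),                                              # -> 13x13x256
    Conv2D(128, 3, activation="relu"),                           # -> 11x11x128
    BatchNormalization(),
    Conv2D(96, 3, activation="relu"),                            # -> 9x9x96
    Conv2D(64, 3, activation="relu"),                            # -> 7x7x64
    MaxPooling2D(),                                              # -> 3x3x64
    Dropout(0.25),                                               # rate assumed
    Flatten(),                                                   # -> 576
    Dense(128, activation="relu"),
    Dropout(0.25),                                               # rate assumed
    Dense(32, activation="relu"),
    Dense(10, activation="softmax"),                             # one unit per digit
])
print(model.count_params())  # 542474, matching the summary
```

The 256 non-trainable parameters in the summary come from the BatchNormalization layer: its moving mean and variance (128 each) are updated during training but not by gradient descent.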

4. Compile

We are predicting categories, so the loss is categorical_crossentropy.
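The compile call might look like this. The article only specifies the loss; the Adam optimizer and the accuracy metric are common choices, assumed here. A tiny stand-in model is used so the snippet runs on its own; the same call applies to the full network above:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten

# Tiny stand-in model; the same compile call applies to the full network.
model = Sequential([Flatten(input_shape=(28, 28, 1)),
                    Dense(10, activation="softmax")])

model.compile(
    loss="categorical_crossentropy",  # expects one-hot labels; use
                                      # "sparse_categorical_crossentropy" for integers
    optimizer="adam",                 # optimizer choice is an assumption
    metrics=["accuracy"],
)
```

If your labels are raw integers (as loaded from the Kaggle CSV), either one-hot encode them with `keras.utils.to_categorical` or switch to the sparse loss.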

5. Training and Visualizing

You define the batch size and the number of epochs, and Keras takes care of the rest.

After training, we can plot the loss and accuracy per epoch to see how the model learned.
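The fit call might look like this, sketched on random stand-in data so it runs anywhere; the batch size and epoch count are illustrative, and with augmentation you would pass `datagen.flow(X, y, batch_size=...)` instead of the raw arrays:

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical

# Random stand-in data and a tiny model, so the fit call can run anywhere.
X = np.random.rand(64, 28, 28, 1).astype("float32")
y = to_categorical(np.random.randint(0, 10, 64), num_classes=10)

model = Sequential([Flatten(input_shape=(28, 28, 1)),
                    Dense(10, activation="softmax")])
model.compile(loss="categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])

# Batch size and epochs are illustrative choices, not the article's values.
history = model.fit(X, y, batch_size=32, epochs=2, verbose=0)

# history.history holds one value per epoch for each metric;
# this is the data behind the training graph.
print(sorted(history.history.keys()))
```

Plotting `history.history["loss"]` and `history.history["accuracy"]` against the epoch number gives the kind of graph described above.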

6. Predicting

You can use the trained model to predict labels for new images.
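The prediction step, sketched on random stand-in data; a real run would load Kaggle’s test.csv (reshaped the same way as the training data) and use the trained model rather than the untrained stand-in below:

```python
import numpy as np
import pandas as pd
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten

# Random stand-in for the Kaggle test images (N, 28, 28, 1).
X_test = np.random.rand(5, 28, 28, 1).astype("float32")

# Stand-in for the trained model.
model = Sequential([Flatten(input_shape=(28, 28, 1)),
                    Dense(10, activation="softmax")])

# predict() returns one probability per class; argmax picks the digit.
probs = model.predict(X_test, verbose=0)
labels = probs.argmax(axis=1)

# The digit-recognizer submission format: ImageId (1-based) and Label.
submission = pd.DataFrame({"ImageId": np.arange(1, len(labels) + 1),
                           "Label": labels})
# submission.to_csv("submission.csv", index=False)  # file to upload to Kaggle
print(submission.shape)  # (5, 2)
```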

Next, let’s move on to CatsVDogs.
