The CORRECT calculation of the number of parameters per layer is:

1. Conv1 layer: We have 6 filters of size (5*5*3) [6 filters as per Andrew's earlier example, not as per the table]. Each filter's convolution with the input has one bias, so 6 biases. Total params: (5*5*3 + 1) * 6 = 456
2. Conv2 layer: We have 16 filters of size (5*5*6). Each filter's convolution has one bias, so 16 biases. Total params: (5*5*6 + 1) * 16 = 2416
3. FC3 layer: We have 400 inputs and 120 neurons. Each neuron has a bias of its own. Total params: 400*120 + 120 = 48120
4. FC4 layer: We have 120 inputs [from FC3] and 84 neurons. Each neuron has a bias of its own. Total params: 120*84 + 84 = 10164
5. The sigmoid/output layer: We have 84 inputs [from FC4] and 10 neurons. Each neuron has a bias of its own. Total params: 84*10 + 10 = 850

Total number of parameters = 456 + 2416 + 48120 + 10164 + 850 = 62006. Hope it helped ;-)
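The arithmetic in this comment can be sanity-checked with a few lines of plain Python (a sketch; the helper names `conv_params` and `fc_params` are mine, not from the lecture):

```python
def conv_params(f, in_ch, n_filters):
    # each filter has f*f*in_ch weights plus 1 bias
    return (f * f * in_ch + 1) * n_filters

def fc_params(n_in, n_out):
    # one bias per output neuron
    return n_in * n_out + n_out

counts = [
    conv_params(5, 3, 6),   # Conv1:  456
    conv_params(5, 6, 16),  # Conv2:  2416
    fc_params(400, 120),    # FC3:    48120
    fc_params(120, 84),     # FC4:    10164
    fc_params(84, 10),      # Output: 850
]
print(counts, sum(counts))  # [456, 2416, 48120, 10164, 850] 62006
```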
@EngineeredFemale
2 years ago
It really did help. Thanks man
@Gajendra_Singh22
1 year ago
thanks
@SanjeetKumar-ef6fj
1 year ago
Shouldn't the parameters in the first layer be calculated as (5*5+1)*6*3 = 468?
@objecttracking31
1 year ago
Below is a Keras implementation of AlexNet. Can anybody explain how they get Dense(4096) in FC layer 2? The code is as follows:

model = tf.keras.models.Sequential([
    # 1st conv
    tf.keras.layers.Conv2D(96, (11, 11), strides=(4, 4), activation='relu', input_shape=(227, 227, 3)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D(2, strides=(2, 2)),
    # 2nd conv
    tf.keras.layers.Conv2D(256, (11, 11), strides=(1, 1), activation='relu', padding="same"),
    tf.keras.layers.BatchNormalization(),
    # 3rd conv
    tf.keras.layers.Conv2D(384, (3, 3), strides=(1, 1), activation='relu', padding="same"),
    tf.keras.layers.BatchNormalization(),
    # 4th conv
    tf.keras.layers.Conv2D(384, (3, 3), strides=(1, 1), activation='relu', padding="same"),
    tf.keras.layers.BatchNormalization(),
    # 5th conv
    tf.keras.layers.Conv2D(256, (3, 3), strides=(1, 1), activation='relu', padding="same"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D(2, strides=(2, 2)),
    # flatten
    tf.keras.layers.Flatten(),
    # FC layer 1, with dropout 0.5
    tf.keras.layers.Dense(4096, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    # FC layer 2, with dropout 0.5
    tf.keras.layers.Dense(4096, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(output_class_units, activation='softmax')
])
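Dense(4096) is simply a chosen layer width (a hyperparameter from the AlexNet paper), not something computed from the previous layer; only the Flatten size is determined by the conv stack. A rough sketch of that size calculation, assuming 'valid' padding where none is specified (the helper `conv_out` is mine):

```python
def conv_out(n, f, s, same=False):
    # spatial size after a conv/pool layer ('valid' padding unless same=True)
    return n if same else (n - f) // s + 1

n = 227
n = conv_out(n, 11, 4)  # 1st conv: 55
n = conv_out(n, 2, 2)   # max pool: 27
# 2nd-5th convs use padding="same", so the spatial size stays 27
n = conv_out(n, 2, 2)   # max pool: 13
flat = n * n * 256      # 256 channels after the 5th conv
print(flat)             # 43264 inputs feeding Dense(4096)
```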
@MrAmgadHasan
1 year ago
@@SanjeetKumar-ef6fj I don't think so. Each filter has only one bias associated with it. Notice that the filter has the same depth as its input. So if an input is 3 channels, each filter will have 3 channels and it will have only one bias.
@aymannaeem22
4 years ago
Getting the # parameters, if we apply it to the NN example (LeNet-5):

For Conv1: f=5, # of filters = 6 (not 8), # previous channels = 3
Rule: (f * f * # previous channels + bias) * # of filters = (5*5*3 + 1) * 6 = 456

For Conv2: f=5, # of filters = 16, # previous channels = 6
Rule: (f * f * # previous channels + bias) * # of filters = (5*5*6 + 1) * 16 = 2416
@zql7351
3 years ago
You are right. It is actually what happens in Keras for 3-channel images. Plus, the number of parameters for FC3, FC4, and softmax are 48120, 10164, and 850 resp.
@aninditadas832
3 years ago
thank you so much for this correction else I was under the assumption that everything I have followed until now is wrong :D
@felixpotter6420
3 years ago
cheers fam, i was having a bit of a mental breakdown but you've alleviated my mind.
@krranaware8396
3 years ago
Is it that the input image is 32*32*1 and not 32*32*3?
@govindashrestha190
3 years ago
You are right. The parameters in the video are wrong.
@subodhsharma3038
4 years ago
@Andrew sir, I think some CORRECTION is needed in the neural network example, for calculating the number of parameters in the CONV and FC layers. I hope the steps below will help someone:

## Formulas to calculate the number of parameters in CONV and FC layers:

Formula 1: Number of parameters in a CONV layer = ((m*n*c) + 1) * k,
where m = filter width, n = filter height, c = number of channels, k = number of filters; the +1 adds one bias per filter.

Formula 2: Number of parameters in an FC layer = (Number of inputs + 1) * Number of outputs

Example 1 (Formula 1 on CONV1):
Given: (f=5, s=1), number of channels c = 3 [from (32,32,3)], number of filters k = 8 [from (28,28,8)]. Filter shape f = 5 means m*n = 5*5.
Number of parameters in CONV1 = ((5*5*3) + 1) * 8 = 76 * 8 = 608

Example 2 (Formula 1 on CONV2):
Given: (f=5, s=1), number of channels c = 8 [from (14,14,8)], number of filters k = 16 [from (10,10,16)].
Number of parameters in CONV2 = ((5*5*8) + 1) * 16 = 201 * 16 = 3216

Example 3 (Formula 2 on FC3):
Given: number of inputs = 400, number of outputs = 120 [from (120,1)].
Number of parameters in FC3 = (400 + 1) * 120 = 48120

Example 4 (Formula 2 on FC4):
Given: number of inputs = 120, number of outputs = 84 [from (84,1)].
Number of parameters in FC4 = (120 + 1) * 84 = 10164

Reference: stanford.edu/~shervine/teaching/cs-230/cheatsheet-convolutional-neural-networks
@bharshavardhan2007
4 years ago
towardsdatascience.com/understanding-and-calculating-the-number-of-parameters-in-convolution-neural-networks-cnns-fc88790d530d You are right. The link has the corrected parameters. You can have a look at it if you want.
@1984ssn
4 years ago
The number of parameters in the CONV layers needs to be corrected.
@mirabirhossain1842
2 years ago
I think there is a catch. If you are calculating with 3 channels, then CONV1 layer should be 28*28*3*8. Why are you not correcting this?
@arjunroyihrpa
5 years ago
Your videos are great... I've lost count of how many times I've come back to your channel. Whenever I have a problem with the basics of deep NNs while doing some new research, I just come to your channel. I have been doing this for the last year. You, Sir, are really great...
@manuel783
3 years ago
CNN Example *CORRECTION* Starting from 9:45, please note that the calculation of the number of parameters is incorrect. Here are the 5 typos:
1. 208 should be (5*5*3 + 1) * 8 = 608
2. 416 should be (5*5*8 + 1) * 16 = 3216
3. In FC3, 48001 should be 400*120 + 120 = 48120, since the bias should have 120 parameters, not 1
4. Similarly, in FC4, 10081 should be 120*84 + 84 (not 1) = 10164. (Here, the bias is for the fully connected layer. In fully connected layers there is one bias per neuron, so the number of biases equals the number of neurons: FC3 has 120 neurons, so 120 biases.)
5. Finally, in the softmax, 841 should be 84*10 + 10 = 850
@zachshaffer44
2 years ago
These are wrong too, the correct calculations are in a comment below
@turzobose40
3 years ago
Here are the 5 typos: (Note: the first CONV layer in the table on the last slide has 8 filters, not 6 filters as drawn by Andrew on the second-to-last slide)
1. 208 should be (5*5*3 + 1) * 8 = 608
2. 416 should be (5*5*8 + 1) * 16 = 3216
3. There should be 1 bias parameter for every neuron in the fully-connected layers. Hence, FC3 will have 120 bias params for 120 neurons and FC4 will have 84 bias params for 84 neurons. So, FC3 => 400 * 120 + 120 = 48120 and FC4 => 120 * 84 + 84 = 10164
4. Finally, the softmax layer should be 84*10 + 10 (for bias, 1 param/neuron) = 850
@FritzKissa
2 years ago
One could argue that softmax doesn't have any learnable parameters, and there's an FC5 layer missing where the parameter count would be 850.
@vivekv6764
2 years ago
For anyone reading the comments, this comment here (by Turzo Bose) has the 100% correct calculations. The basic problem is that the figure he draws and the table (at the end) don't match. So, just follow the table at the end and you'll get the correct calculations.
@aravindhankrishnan1300
6 years ago
For fully connected layers, why is there only a single bias? At least that is what I infer from the calculations. For example, in FC3: Input: 400-dimensional flattened vector. Number of units in FC3 = 120. So I would expect the number of parameters to learn = 400 * 120 (weights) + 120 (bias) = 48120. But it is given as 400 * 120 + 1 = 48001. Am I missing any trivial or key point, or is that a genuine mistake?
@kristenm6
6 years ago
I think it is a mistake---I got 48120 for the number of parameters as well. He states that the number of biases for FC3 should be 120 at 6:20. I think the number of parameters for FC4 is also wrong---I got 10164 for this.
@LeenaGurgPhysics
4 years ago
The number of parameters are all correct. I thought of it as a set of 400 points that has a set of 120 points behind it. There are 400 x120 =48000 connections possible. Now just add 1 more connection(bias) to the 400 points. So, now you have 48001 as the number of parameters.
@1xaps
4 years ago
@@kristenm6 I think that too.
@migueloteropedrido6519
3 years ago
@@LeenaGurgPhysics Nope, that's incorrect. A neuron just consists of a linear combination + a nonlinear activation. The linear combination is w1*x1 + w2*x2 + ... + wn*xn + b, where b is the bias. Then we apply a nonlinear activation (such as ReLU) to the result. That "b" is the bias, and you have one for every neuron in the layer. So if you have 400 inputs and a layer with 120 neurons, the total number of parameters is 400*120 + 120 = 48120.
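A quick plain-Python sketch of such a layer (illustrative values only; the variable names are mine) makes the per-neuron bias count visible:

```python
import random

random.seed(0)
x = [random.gauss(0, 1) for _ in range(400)]  # flattened 400-dim input
W = [[random.gauss(0, 1) for _ in range(400)] for _ in range(120)]  # one weight row per neuron
b = [random.gauss(0, 1) for _ in range(120)]  # one bias per neuron, not one total

# ReLU(w . x + b) for each of the 120 neurons
a = [max(0.0, sum(w_i * x_i for w_i, x_i in zip(row, x)) + b_j)
     for row, b_j in zip(W, b)]
n_params = sum(len(row) for row in W) + len(b)
print(len(a), n_params)  # 120 48120
```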
@SMShaon-uf4lr
6 years ago
In conv1, the filter size will be 5*5*3 because the input activation shape is 32*32*3. So in conv1, the filter size is 5*5*3 and the parameters per filter are 5*5*3 = 75, + 1 (bias) = 76. The conv1 activation shape is 28*28*8, which means it has 8 filters, so the parameter count is 76 * 8 = 608. Isn't that right?
@KevinKuei
6 years ago
I have the same question too. :)
@pehpa
6 years ago
He seems to share the weights over the input channels, i.e. one single filter in CONV1 has 5x5+1=26 parameters. In total, i.e. for 8 filters, this results in 8x5x5+8=208 parameters. This way, the R, G, and B input channels get convolved with the same weights. But you could easily have different weights for each input channel, e.g. as is used by default in Keras layers. Then you'd have 8x5x5x3+8=608 parameters. However, there is one additional problem with the parameters for the fully connected layers here. For FC3 it should actually be 120x400+120=48120 parameters, since you'd have one bias for each output node. The same goes for FC4, where it should actually be 10164 parameters, and the output layer, where it should be 850 parameters.
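The two conventions this comment contrasts can be computed side by side (a sketch; "shared" matches the slide's 208, "per-channel" matches the Keras default):

```python
f, channels, filters = 5, 3, 8

# weights shared across the R, G, B channels (what the slide's table implies)
shared = (f * f + 1) * filters
# separate weights per input channel (the Keras/standard convention)
per_channel = (f * f * channels + 1) * filters

print(shared, per_channel)  # 208 608
```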
@sukeshadiga7130
6 years ago
Yes, you are right. It should be (5*5*3+1) * 8 = 608.
@marr73
6 years ago
The table might be using a 1D image, not RGB, as he used (5*5+1)*8 to get 208. I am also confused: in the drawing he used 28x28x6, but the table listed 28x28x8.
@pritipachpande413
6 years ago
Can anyone explain how this calculation is done, e.g. how we can calculate the number of filters?
@mohammadserdar5351
2 years ago
wrong wrong wrong !!!
@eshandas3645
1 month ago
Did he skip fully connected layers?
@billykotsos4642
3 years ago
The Myth, the Legend
@afrinsultana5972
2 years ago
I don't have enough words to admire you. Thank you, sir.
@Vinay1272
4 years ago
Why is the number of channels = 6 at Conv1?
@bannourbouakkez7905
9 months ago
so that's it for pooling hh , love
@vivekv6764
2 years ago
For anyone coming to this video as of 21 May 2022, the numbers in the figure & the table (at the end) don't match. The correct calculations are in the comment posted by Turzo Bose.
@mikiyaszelalem3872
10 months ago
I don't understand how the parameters came to be 208 in the conv layer.
@lovemormus
5 years ago
For those who can't understand why conv1 has 208 parameters: in this conv1 layer, the filter size is 5*5, which means there are 5*5+1 (the bias) parameters, and since we have 8 filters (see the last element of the activation shape), we have (5*5+1)*8 parameters to train.
@sahhaf1234
5 years ago
Yes, I got this. But if we take the beginning of the lecture as a reference, the filter size must be 5*5*3+1, NOT 5*5+1. Hence the number of parameters must be (5*5*3+1)*8 = 608, not 208. I think there is a mistake here.
@vijagish01
5 years ago
@@sahhaf1234 I guess, as someone mentioned above, he has shared the weights for the 3 input channels. That's why it's 5*5 + 1.
@franco521
4 years ago
@@vijagish01 that's probably the case, but it should have been clarified because it's very unsettling.
@WilliamVoje
4 years ago
@@vijagish01 How do you deal with 3 channels sharing a single filter? Do you sum across the channels before you apply the filter?
@RH-mk3rp
2 years ago
For future viewers, the table of values is magnificently wrong and most likely not created by Andrew Ng. The correct values are in the comments. As for the 208 number, it is actually filter (5, 5) x input_nc (3) x nc (6 filters) + bias (6 filters) = 5 x 5 x 3 x 6 + 6 = 456 parameters
@hermonjay4744
6 years ago
At 9:29, how do you get the number of parameters? And I don't really get the intuition: how can the activation shape be 28x28x8? Why not 28x28x5? I thought the third number was the number of filters?
@kunhongyu5053
6 years ago
Hermon Jay The filter size is 5x5x3, the number of filters is 8
@stickmanjournal
2 years ago
I don't understand why he said as the layer goes deeper the channels increase. Shouldn't the channels always have the same value?
@viditsharma3929
2 years ago
I think the parameters are all wrong; I mean their count, as per your teachings.
@thecandel5479
4 years ago
Thanks very much! But I hope you will explain how the numbers change from layer to layer. 🌹
@zeesamuel9885
4 years ago
Hi Read, you have to go to Lecture 1, and there are also courses on Coursera by Andrew.
@sarfrazahmed5213
7 months ago
Sir, how do you add the number of channels 3, 6, 6, and 16?
@ati43888
6 months ago
Thanks
@chamanthipyneni7827
1 year ago
How many layers does a CNN need to have for 4 class labels?
@timharris72
5 years ago
A lot of the comments here are asking for real examples. He talks about LeNet in this video. The pyimagesearch blog has an example of this net implemented in Keras and Python. I think it will answer most of the questions here. www.pyimagesearch.com/2016/08/01/lenet-convolutional-neural-network-in-python/
@saramessara4241
3 years ago
Why is softmax used only in the output layer?
@ECB-SanjayReddy
1 year ago
The 120 and 84 we took in FC3 and FC4: is that our choice, i.e. can we choose any value? And also, can the softmax value be anything other than 10?
@eshandas3645
1 month ago
The softmax value is the number of output classes; here the classes are digits 0 to 9, so 10.
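In other words, the 10-way softmax just turns 10 scores (one per digit) into probabilities; a minimal sketch with made-up logit values:

```python
import math

logits = [0.5, 1.2, -0.3, 2.0, 0.0, 0.1, -1.0, 0.7, 1.5, 0.2]  # one score per digit 0-9
exps = [math.exp(z) for z in logits]
probs = [e / sum(exps) for e in exps]  # softmax: normalize to probabilities

print(len(probs))               # 10 classes
print(probs.index(max(probs)))  # predicted digit: 3 (largest logit)
```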
@thuandiec5756
2 years ago
I don't know which one is the 120?
@debarunkumer2019
4 years ago
In the table shown in the last part of the video, the parameter count is calculated by multiplying the preceding and present activation sizes and adding one to the product. That is what I have inferred from the table. What is the significance of adding one to the product? For instance, for input FC3, the parameter size is determined by taking the product (120*400)+1 = 48001. Why is 1 being added to the result?
@gowthamarun43
3 years ago
It is wrong; it should be 48120 = 120*400 + 120. The 120 which is added is the bias for the 120 hidden units.
@siddhantpathak3162
4 years ago
Awesome video, but a hell of a lot of mistakes in the table.
@nurjamil6381
6 years ago
In fully connected layers 2 and 3, what type of activation function is used?
@timharris72
5 years ago
Originally they used tanh for LeNet. Now they use ReLU.
@elgs1980
4 years ago
What are the 6 filters? I mean, what numbers should be filled into the filters?
@ThePaypay88
3 years ago
It's the number of features you would want to learn. It's a trial-and-error number. Before deep learning, people did image-processing filters manually, so you would put in what you want (e.g. if you want to detect a horizontal edge you would add one filter for that; for vertical, add one more; for blurriness, add one; etc.). With 6 filters, one hopes it finds 6 features like this.
@elgs1980
4 years ago
Are the layers of 120 and 84 hidden layers?
@nehasoni6235
5 years ago
I am confused about getting the number of parameters.
Please, anyone: why 5 in the first convolution layer (why 32-5+1 = 28)? Where does the '5' come from? Please help me.
@tejasvigupta07
4 years ago
It's the formula (n - l + 1), with n = 32, l = 5.
@NguyenNhan-yg4cb
4 years ago
@@tejasvigupta07 bro, 32 is the input image 32x32, but what about the 5?
@tejasvigupta07
4 years ago
The 5 comes from the size of the filter. Here it is a 5x5 filter, so f = l = 5.
@NguyenNhan-yg4cb
4 years ago
@@tejasvigupta07 bro, thanks so much. What about 120, and why does 16@5x5 become 120, and 120 become 84? I can understand 10; it's probably the labels (0,1,2,3,4,5,6,7,8,9).
@tejasvigupta07
4 years ago
When we make a layer, we get to decide the number of units we want in the hidden layer. What I mean is that our hidden layer can have 5 units and I can connect it to a next layer with 10 units or even 2 units. When we make a layer in a neural network, it's up to us to select the number of units in it.
@abhishekshankar1136
4 years ago
9:50 correction - the number of parameters in CONV1 and CONV2 is (5x5x3+1)x8 = 608 and (5x5x8+1)x16 = 3216 respectively; FC3 is 120*(400+1) = 48120 and FC4 is 84*(120+1) = 10164.
@edmundchan8923
5 years ago
May I know why, from Pool1 to Conv2, the size changes to 10x10x16?
@wiLsonChoO93
5 years ago
Using his formula, (n + 2p - f)/s + 1 = ((input size) + 2(padding) - (filter size))/stride + 1: (14 + 2(0) - 5)/1 + 1 = 10. 16 is just the number of filters/kernels he decided to use (a hyperparameter).
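Note that the standard form of this formula has a trailing +1, i.e. output = floor((n + 2p - f)/s) + 1 (which is what makes (14-5)/1 come out as 10). A quick check against all the spatial sizes in the example (the helper name `out_size` is mine):

```python
def out_size(n, f, p=0, s=1):
    # floor((n + 2p - f) / s) + 1
    return (n + 2 * p - f) // s + 1

print(out_size(32, 5))       # Conv1: 32 -> 28
print(out_size(28, 2, s=2))  # Pool1: 28 -> 14
print(out_size(14, 5))       # Conv2: 14 -> 10
print(out_size(10, 2, s=2))  # Pool2: 10 -> 5
```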
@shannonliu3382
4 years ago
@@wiLsonChoO93 and the 16 is because Ng uses 16 filters.
@wiLsonChoO93
4 years ago
@@shannonliu3382 yes
@madhavtibrewal4924
3 years ago
But didn't he say the number of filters used in the previous layer determines the number of channels in the present layer?
@i_amdosa3068
5 years ago
Day 1
@nagarajnagu7714
4 years ago
Please, anyone: how does the depth value change from 6 to 16?
@bharshavardhan2007
4 years ago
6 or 16 is the number of filters used. For each layer it is our choice how many filters to use, which means we can use different filters to get different types of features, like vertical edges and horizontal edges.
@bharshavardhan2007
4 years ago
Watch again at the 1:05 mark; he mentioned he is using 6 filters.
@nagarajnagu7714
4 years ago
Before that he was using 5 filters in total, right? What is the use of using filters again?
@nagarajnagu7714
4 years ago
We get the value 28 by using the formula, right? What I am not getting is the 6. In the calculation the filter value is 5, so how is it 6?
@bharshavardhan2007
4 years ago
@@nagarajnagu7714 f = 5 means one filter's size is 5 x 5. It doesn't mean 5 filters are used. In the video after 1:05 he says he is using 6 filters.
@mostinho7
5 years ago
Personal TODO: @1:47 check that the pooling layer output follows the same formula as a normal filter
@WahranRai
6 years ago
Your example is too theoretical. Why not show the data (filters, etc.)?
Comments: 117