Programming exercise 4:
sigmoidGradient.m
randInitializeWeights.m
nnCostFunction.m
sigmoidGradient.m
The derivative of the sigmoid function has a simple closed form:
g = sigmoid(z) .* (1 - sigmoid(z));
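As a quick sanity check, the sketch below compares this closed form with a central finite difference. The inline `sigmoid` handle is only a stand-in for the exercise's sigmoid.m, and the step size `h` is an arbitrary choice.

```octave
% Stand-in for sigmoid.m from the exercise (assumed definition).
sigmoid = @(z) 1 ./ (1 + exp(-z));
% Closed-form derivative, the same expression used in sigmoidGradient.m.
sigmoidGradient = @(z) sigmoid(z) .* (1 - sigmoid(z));

z = [-2 -1 0 1 2];
h = 1e-5;                                  % finite-difference step (arbitrary)
numeric  = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h);
analytic = sigmoidGradient(z);

max_diff = max(abs(numeric - analytic));   % should be very small
g0 = sigmoidGradient(0);                   % the derivative peaks at z = 0
```

At z = 0 the sigmoid equals 0.5, so the gradient there is 0.5 * 0.5 = 0.25.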
randInitializeWeights.m
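Randomly initializes the weights of a layer. Each entry is drawn uniformly from [-epsilon_init, epsilon_init], so that the hidden units do not all start out identical.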
epsilon_init = 0.12;
W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;
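The sketch below checks the shape and range of such a matrix. It also evaluates one common heuristic for choosing the range, sqrt(6) / sqrt(L_in + L_out). The layer sizes 400 and 25 are assumed here for illustration and are not stated in these notes.

```octave
L_in = 400;  L_out = 25;     % layer sizes assumed for illustration
% Heuristic range based on the layer sizes; comes out close to 0.12.
epsilon_heuristic = sqrt(6) / sqrt(L_in + L_out);

epsilon_init = 0.12;
W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;

% Every entry should lie in [-epsilon_init, epsilon_init];
% the extra column holds the weights of the bias unit.
in_range = all(abs(W(:)) <= epsilon_init);
dims = size(W);
```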
nnCostFunction.m
An application of the backpropagation algorithm.
Step 1: use forward propagation to compute the activations of each layer.
a1 = [ones(m,1), X];
z2 = a1 * Theta1';
a2 = sigmoid(z2);
a2 = [ones(m, 1),a2];
z3 = a2 * Theta2';
a3 = sigmoid(z3);
Y = zeros(m, num_labels);
% Convert y into a one-hot matrix: row i has a 1 in column y(i)
for i = 1:m
Y(i, y(i)) = 1;
end
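The one-hot conversion can also be written without a loop by indexing rows of an identity matrix. The following is a minimal sketch with made-up labels. It assumes, as the loop does, that the labels are integers in 1..num_labels.

```octave
y = [2; 1; 3; 2];            % example labels (made up)
num_labels = 3;
m = numel(y);

% Loop version, as in the notes.
Y_loop = zeros(m, num_labels);
for i = 1:m
  Y_loop(i, y(i)) = 1;
end

% Vectorized version: row y(i) of the identity matrix is the one-hot row.
I = eye(num_labels);
Y_vec = I(y, :);

same = isequal(Y_loop, Y_vec);
```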
Step 2: compute the cost function.
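% Per-sample cost, summed over the output units
Jk = sum(-Y .* log(a3) - (1 - Y) .* log(1 - a3), 2);
J = sum(Jk) / m;
% Regularization term; the bias weights (first column) are excluded
J = J + lambda / (2 * m) * (sum(sum(Theta1(:, 2:end).^2)) + sum(sum(Theta2(:, 2:end).^2)));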
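To make the arithmetic concrete, here is a small worked sketch with made-up numbers. The outputs `a3` are not produced by the weights shown; they are chosen only to illustrate how the two parts of the cost are formed and that the first (bias) column of each weight matrix is left out of the penalty.

```octave
a3 = [0.9 0.1; 0.2 0.8];     % made-up network outputs for m = 2 samples
Y  = [1 0; 0 1];             % one-hot targets
m = size(Y, 1);
lambda = 1;
Theta1 = [5 1 -1];           % first column is the bias weight
Theta2 = [7 2; -3 0.5];      % first column is the bias weight

% Unregularized cross-entropy cost, averaged over the samples.
J_unreg = sum(sum(-Y .* log(a3) - (1 - Y) .* log(1 - a3))) / m;

% Penalty on the non-bias weights only: (1 + 1 + 4 + 0.25) / 4 = 1.5625.
reg = lambda / (2 * m) * (sum(sum(Theta1(:, 2:end).^2)) + sum(sum(Theta2(:, 2:end).^2)));

J = J_unreg + reg;
```

The large bias weights 5, 7 and -3 have no effect on `reg`, which is the point of slicing with `2:end`.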
Step 3: use backpropagation to compute the gradients.
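% Output-layer error
delta3 = a3 - Y;
% Hidden-layer error; a2 .* (1 - a2) equals sigmoidGradient(z2) on the non-bias columns
delta2 = (delta3 * Theta2) .* (a2 .* (1 - a2));
% Drop the bias column, which carries no error
delta2 = delta2(:, 2:end);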
Delta2 = zeros(size(delta3, 2), size(a2, 2));
Delta1 = zeros(size(delta2, 2), size(a1, 2));
for i = 1:m
Delta2 = Delta2 + delta3(i, :).' * a2(i, :);
Delta1 = Delta1 + delta2(i, :).' * a1(i, :);
end
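The per-sample accumulation can also be expressed as two matrix products, since a sum of outer products over the rows equals `delta' * a`. The sketch below checks this on random data. The sizes are arbitrary and the variables are placeholders, not values from a real network.

```octave
m = 4;                                       % number of samples (arbitrary)
delta3 = rand(m, 3);                         % placeholder output-layer errors
a2 = [ones(m, 1), rand(m, 5)];               % placeholder hidden activations with bias
delta2 = rand(m, 5);                         % placeholder hidden-layer errors
a1 = [ones(m, 1), rand(m, 2)];               % placeholder inputs with bias

% Loop accumulation, as in the notes.
Delta2_loop = zeros(size(delta3, 2), size(a2, 2));
Delta1_loop = zeros(size(delta2, 2), size(a1, 2));
for i = 1:m
  Delta2_loop = Delta2_loop + delta3(i, :).' * a2(i, :);
  Delta1_loop = Delta1_loop + delta2(i, :).' * a1(i, :);
end

% Vectorized accumulation.
Delta2_vec = delta3' * a2;
Delta1_vec = delta2' * a1;

err = max([max(abs(Delta2_loop(:) - Delta2_vec(:))), ...
           max(abs(Delta1_loop(:) - Delta1_vec(:)))]);
```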
Theta1_grad = Delta1 / m;
Theta1_grad(:, 2 : end) = Theta1_grad(:, 2 : end) + Theta1(:, 2:end) *(lambda / m);
Theta2_grad = Delta2 / m;
Theta2_grad(:, 2 : end) = Theta2_grad(:, 2 : end) + Theta2(:, 2:end) *(lambda / m);
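A common way to validate these gradients is to compare them with central finite differences of the cost. The sketch below does this on a tiny made-up network (2 inputs, 3 hidden units, 2 labels). The data, the weights, `lambda` and the step `h` are all assumptions chosen for illustration, and the inline `sigmoid` handle stands in for sigmoid.m.

```octave
sigmoid = @(z) 1 ./ (1 + exp(-z));           % stand-in for sigmoid.m

X = [0.1 0.5; -0.3 0.8; 0.7 -0.2];           % made-up inputs, m = 3
y = [1; 2; 1];                               % made-up labels
m = size(X, 1);  lambda = 0.5;
I = eye(2);  Y = I(y, :);                    % one-hot targets
Theta1 = reshape(sin(1:9), 3, 3) / 2;        % 3 x (2 + 1), arbitrary values
Theta2 = reshape(cos(1:8), 2, 4) / 2;        % 2 x (3 + 1), arbitrary values

% Forward pass and regularized cost as functions of the weights.
fwd  = @(T1, T2) sigmoid([ones(m, 1), sigmoid([ones(m, 1), X] * T1')] * T2');
cost = @(T1, T2) sum(sum(-Y .* log(fwd(T1, T2)) - (1 - Y) .* log(1 - fwd(T1, T2)))) / m ...
       + lambda / (2 * m) * (sum(sum(T1(:, 2:end).^2)) + sum(sum(T2(:, 2:end).^2)));

% Analytic gradients, following the three steps in the notes
% (the accumulation is written as matrix products instead of a loop).
a1 = [ones(m, 1), X];
a2 = [ones(m, 1), sigmoid(a1 * Theta1')];
a3 = sigmoid(a2 * Theta2');
delta3 = a3 - Y;
delta2 = (delta3 * Theta2) .* (a2 .* (1 - a2));
delta2 = delta2(:, 2:end);
Theta1_grad = delta2' * a1 / m;
Theta1_grad(:, 2:end) = Theta1_grad(:, 2:end) + Theta1(:, 2:end) * (lambda / m);
Theta2_grad = delta3' * a2 / m;
Theta2_grad(:, 2:end) = Theta2_grad(:, 2:end) + Theta2(:, 2:end) * (lambda / m);

% Numerical gradients: perturb one weight at a time.
h = 1e-4;
num1 = zeros(size(Theta1));
for k = 1:numel(Theta1)
  E = zeros(size(Theta1));  E(k) = h;
  num1(k) = (cost(Theta1 + E, Theta2) - cost(Theta1 - E, Theta2)) / (2 * h);
end
num2 = zeros(size(Theta2));
for k = 1:numel(Theta2)
  E = zeros(size(Theta2));  E(k) = h;
  num2(k) = (cost(Theta1, Theta2 + E) - cost(Theta1, Theta2 - E)) / (2 * h);
end

numgrad = [num1(:); num2(:)];
grad = [Theta1_grad(:); Theta2_grad(:)];
rel_diff = norm(numgrad - grad) / norm(numgrad + grad);
```

If the backpropagation code is correct, `rel_diff` should be very small. A value that is not small usually points to a mistake in the delta terms or in how the bias column is handled.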