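Backpropagation is one of the most important parts of deep learning. In torch, backward computes the gradients over the computation graph and accumulates them.
The session below demonstrates basic backward usage and a few points to watch out for.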
>>> import torch
>>> from torch.autograd import Variable
>>> x = Variable(torch.ones(2,2), requires_grad=True)
>>> y = x + 2
>>> y.grad_fn
Out[6]: <torch.autograd.function.AddConstantBackward at 0x229e7068138>
>>> y.grad # prints nothing: y is not a leaf of the graph, so its grad is None
>>> z = y*y*3
>>> z.grad_fn
Out[9]: <torch.autograd.function.MulConstantBackward at 0x229e86cc5e8>
>>> z
Out[10]:
Variable containing:
27 27
27 27
[torch.FloatTensor of size 2x2]
>>> out = z.mean()
>>> out.grad_fn
Out[12]: <torch.autograd.function.MeanBackward at 0x229e86cc408>
>>> out.backward() # out is a scalar, so backward() needs no argument
>>> x.grad
Out[19]:
Variable containing:
4.5000 4.5000
4.5000 4.5000
[torch.FloatTensor of size 2x2]
>>> out # out is a scalar
Out[20]:
Variable containing:
27
[torch.FloatTensor of size 1]
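Why 4.5? Here out = mean(3 * (x + 2)^2) over the four elements, so d(out)/dx_i = (3/4) * 2 * (x_i + 2) = 1.5 * 3 = 4.5 at x_i = 1. As a rough sanity check that does not depend on torch, the sketch below approximates the same gradient with central finite differences. The helper names out_fn and numeric_grad are made up for this illustration and are not part of any library.

```python
def out_fn(xs):
    """Plain-Python copy of the graph above: out = mean(3 * (x + 2)^2)."""
    return sum(3.0 * (x + 2.0) ** 2 for x in xs) / len(xs)


def numeric_grad(f, xs, eps=1e-6):
    """Approximate df/dx_i for every i with a central difference."""
    grads = []
    for i in range(len(xs)):
        up = list(xs)
        down = list(xs)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads


# The 2x2 tensor of ones, flattened into a list of four values.
xs = [1.0, 1.0, 1.0, 1.0]
grads = numeric_grad(out_fn, xs)
print(out_fn(xs))  # value of out
print(grads)       # each entry should be close to the analytic 4.5
```

Each approximated entry should agree with the x.grad printed by the session, up to floating-point error.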
>>> x = Variable(torch.Tensor([2,2,2]), requires_grad=True)
>>> y = x*2
>>> y
Out[52]:
Variable containing:
4
4
4
[torch.FloatTensor of size 3]
>>> y.backward() # y is not a scalar, so backward() needs a gradient argument: a tensor of the same shape as y that gives a weight for each element
Traceback (most recent call last):
  File "C:\Users\dell\Anaconda3\envs\my-pytorch\lib\site-packages\IPython\core\interactiveshell.py", line 2862, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-53-95acac9c3254>", line 1, in <module>
    y.backward()
  File "C:\Users\dell\Anaconda3\envs\my-pytorch\lib\site-packages\torch\autograd\variable.py", line 156, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "C:\Users\dell\Anaconda3\envs\my-pytorch\lib\site-packages\torch\autograd\__init__.py", line 86, in backward
    grad_variables, create_graph = _make_grads(variables, grad_variables, create_graph)
  File "C:\Users\dell\Anaconda3\envs\my-pytorch\lib\site-packages\torch\autograd\__init__.py", line 34, in _make_grads
    raise RuntimeError("grad can be implicitly created only for scalar outputs")
RuntimeError: grad can be implicitly created only for scalar outputs
>>> y.backward(torch.FloatTensor([0.1, 1, 10]))
>>> x.grad # note: 0.1, 1, 10 are the weights applied to the gradient of each element of y
Out[55]:
Variable containing:
0.2000
2.0000
20.0000
[torch.FloatTensor of size 3]
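The tensor passed to backward can be read as the vector v in a vector-Jacobian product: with y = 2x the Jacobian is 2I, so x.grad = 2 * v = [0.2, 2, 20]. The sketch below repeats this step in the newer tensor API, assuming PyTorch 0.4 or later, where Variable is no longer needed and requires_grad is set directly on the tensor. The names v and x_alt are introduced here only for illustration.

```python
import torch

# Leaf tensor that tracks gradients (no Variable wrapper in this style).
x = torch.tensor([2.0, 2.0, 2.0], requires_grad=True)
y = x * 2

# Weights for each element of y; must have the same shape as y.
v = torch.tensor([0.1, 1.0, 10.0])
y.backward(gradient=v)
print(x.grad)  # expected to match 2 * v

# Equivalent view: reduce y to the scalar sum(v * y), then call backward()
# with no argument. A fresh leaf is used so the gradients do not accumulate.
x_alt = torch.tensor([2.0, 2.0, 2.0], requires_grad=True)
(x_alt * 2 * v).sum().backward()
print(x_alt.grad)
```

Both routes should give the same gradient, which is one way to see why a non-scalar output needs explicit weights while a scalar does not.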
>>> y.backward(torch.FloatTensor([0.1, 1, 10]))
>>> x.grad # gradients accumulate across backward() calls
Out[57]:
Variable containing:
0.4000
4.0000
40.0000
[torch.FloatTensor of size 3]
>>> x.grad.data.zero_() # reset the accumulated gradient to zero
Out[60]:
0
0
0
[torch.FloatTensor of size 3]
>>> x.grad # the accumulated gradient is now zero
Out[61]:
Variable containing:
0
0
0
[torch.FloatTensor of size 3]
>>> y.backward(torch.FloatTensor([0.1, 1, 10]))
>>> x.grad
Out[63]:
Variable containing:
0.2000
2.0000
20.0000
[torch.FloatTensor of size 3]
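The accumulate-then-reset pattern above can be sketched in the newer tensor API as follows, again assuming PyTorch 0.4 or later. One deliberate difference from the session: the product x * 2 is rebuilt before every backward call, because recent versions may refuse to run backward a second time through the same graph unless retain_graph=True is passed. The names first, second and third are only for illustration.

```python
import torch

x = torch.tensor([2.0, 2.0, 2.0], requires_grad=True)
v = torch.tensor([0.1, 1.0, 10.0])

# First pass: x.grad starts out as None and is filled with 2 * v.
(x * 2).backward(v)
first = x.grad.clone()

# Second pass: the new gradient is added to the stored one.
(x * 2).backward(v)
second = x.grad.clone()

# Reset in place, then a third pass gives the single-pass result again.
x.grad.zero_()
(x * 2).backward(v)
third = x.grad.clone()

print(first, second, third)
```

In a training loop the same reset is usually done once per iteration, for example through the optimizer's zero_grad() method, so that each step sees only the gradient of the current batch.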