About torch.Tensor

* torch.Tensor

(n-dimensional array)

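* torch.Tensor.backward(gradient, retain_graph, create_graph)

- Propagates the gradient argument from the root tensor through the backward graph to every reachable leaf node, computing the actual gradients along the way. For a scalar root the argument can be omitted, in which case it is treated as a tensor holding 1.0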
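
As a minimal sketch of the role of the gradient argument (the tensor values and the name `weights` are made up for illustration), the snippet below compares a scalar root, where the argument can be left out, with a non-scalar root, where a gradient of the same shape has to be passed in:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2                          # non-scalar output, shape (3,)

# Scalar root: the gradient argument may be omitted (treated as 1.0).
y.sum().backward()
grad_from_scalar = x.grad.clone()   # d(sum(x**2))/dx = 2 * x

# Non-scalar root: a gradient tensor of the same shape must be supplied.
x.grad = None                       # reset, otherwise gradients accumulate
y = x ** 2                          # rebuild the graph (the old one was freed)
weights = torch.tensor([1.0, 0.1, 0.01])   # hypothetical upstream gradient
y.backward(gradient=weights)
grad_from_vector = x.grad.clone()   # weights * 2 * x

print(grad_from_scalar)
print(grad_from_vector)
```

The supplied tensor multiplies the local derivative element-wise, which is why a scalar root with an implicit 1.0 gives the plain derivative.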

* torch.no_grad()

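- Useful for testing and inference

- Disables autograd tracking inside the block

- Reduces memory usage and speeds up computation

- Results computed inside the block cannot be backpropagated (so nothing in the test code gets backpropagated by accident)

vs model.eval()

- model.eval() # <-> model.train()

: Tells every layer that the model is in evaluation (testing) mode

: In practice this makes layers such as batchnorm and dropout use their eval behavior instead of their training behavior; it does not disable autograd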
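
The difference between the two can be sketched as follows. The tiny `nn.Sequential` model is a hypothetical stand-in, not a model from these notes; the point is only that `model.eval()` changes layer behavior while `torch.no_grad()` switches off gradient tracking, so inference code typically uses both:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))  # hypothetical model
x = torch.ones(1, 4)

model.eval()                        # dropout now acts as the identity
out_eval = model(x)
eval_tracks_grad = out_eval.requires_grad           # autograd is still on

with torch.no_grad():               # autograd off, layers still in eval mode
    out_inference = model(x)
no_grad_tracks_grad = out_inference.requires_grad   # nothing is recorded

# In eval mode the two forward passes agree, because dropout is inactive.
same_in_eval = torch.equal(out_eval.detach(), out_inference)

model.train()                       # switch back to training behavior
print(eval_tracks_grad, no_grad_tracks_grad, same_in_eval)
```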

* detach()

- Use it when two networks are connected but the gradient of the loss should not be backpropagated past the detached point

- A GAN has two networks, a fake-image generator and a discriminator; use detach when only the discriminator should be trained!

- A Network -> point X -> B Network -> Loss

: To train only B Network (in the backward pass), detach at X (the tensor returned by X.detach() is a leaf node)
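
The A Network -> X -> B Network picture above can be sketched with two small linear layers. The names `net_a` and `net_b` and the layer sizes are assumptions chosen for illustration, not part of any real GAN:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net_a = nn.Linear(3, 3)             # stands in for "A Network" (e.g. a generator)
net_b = nn.Linear(3, 1)             # stands in for "B Network" (e.g. a discriminator)

inp = torch.randn(2, 3)
x = net_a(inp)                      # point X, still attached to net_a
loss = net_b(x.detach()).sum()      # gradient flow stops at the detached copy
loss.backward()

a_has_grad = net_a.weight.grad is not None   # A received no gradient
b_has_grad = net_b.weight.grad is not None   # only B can be updated
print(a_has_grad, b_has_grad)
```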

* grad

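- Default is None

- After the first backward() call that computes a gradient for self, it becomes a Tensor holding that gradient

- Later backward() calls accumulate (add) their gradients into it, so it is usually reset before each new step (for example with optimizer.zero_grad())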
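
A small sketch of this accumulation behavior, using a made-up scalar `w`:

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
grad_before = w.grad                # None before any backward()

(w * 3).backward()
first = w.grad.clone()              # d(3w)/dw = 3

(w * 3).backward()
second = w.grad.clone()             # 3 + 3 = 6, accumulated

w.grad.zero_()                      # manual reset
after_reset = w.grad.clone()

print(grad_before, first, second, after_reset)
```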

* is_leaf

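- By convention, every Tensor with requires_grad=False is a leaf Tensor

- A Tensor with requires_grad=True is a leaf Tensor if it was created directly by the user. It is then not the result of an operation, so its grad_fn is None

- Only leaf Tensors get their grad populated by backward()

- To get grad populated for a non-leaf Tensor, call retain_grad() on it

- "Leaf" means a leaf node of the computational graph

- A node that is not part of any computational graph also counts as a leaf node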
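
The rules above can be checked on a three-tensor graph. The values are arbitrary; `retain_grad()` is what lets the intermediate tensor `c` keep its gradient:

```python
import torch

a = torch.tensor(2.0, requires_grad=True)   # user-created: leaf
b = torch.tensor(3.0)                       # requires_grad=False: leaf by convention
c = a * b                                   # result of an operation: not a leaf
c.retain_grad()                             # ask autograd to keep c.grad
d = c * 4
d.backward()

leaf_flags = (a.is_leaf, b.is_leaf, c.is_leaf)
a_grad = a.grad         # d(d)/da = 4 * b = 12
c_grad = c.grad         # d(d)/dc = 4, kept only because of retain_grad()
b_grad = b.grad         # None: b does not require grad

print(leaf_flags, a_grad, c_grad, b_grad)
```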

* retain_graph

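- Useful when backward() has to be called more than once on the same graph within one iteration

- By default, the buffers of the computational graph are freed once backward() has run

- To backpropagate several times (several backward() calls), the computational graph has to be kept with retain_graph=True

- When is backpropagation run several times?

1. When there are several loss functions and each needs its own backward pass

2. When the network has several heads

3. In GANs

- If two models sit between input and output and a loss is evaluated after each model, one option is to pass retain_graph=True to the first loss's backward(); another is to form total_loss = loss1 + loss2 and call backward() on total_loss.

- Separate backward passes also help when each loss should use a different optimizer or a different step size (learning rate)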
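
The two options can be compared on a toy graph in which two losses share an intermediate tensor. `make_losses` is a hypothetical helper written for this sketch, and the numbers are arbitrary:

```python
import torch

def make_losses():
    """Build a fresh graph: two losses that share the intermediate h."""
    w = torch.tensor(2.0, requires_grad=True)
    h = w * w                   # shared part; mul saves w for the backward pass
    loss1 = h * 3.0
    loss2 = h * 5.0
    return w, loss1, loss2

# Option 1: two backward passes, keeping the graph after the first one.
w, loss1, loss2 = make_losses()
loss1.backward(retain_graph=True)
loss2.backward()
grad_two_passes = w.grad.clone()

# Option 2: one backward pass on the summed loss.
w, loss1, loss2 = make_losses()
(loss1 + loss2).backward()
grad_total = w.grad.clone()

# Without retain_graph, the second pass through the shared part fails.
w, loss1, loss2 = make_losses()
loss1.backward()
try:
    loss2.backward()
    second_pass_failed = False
except RuntimeError:
    second_pass_failed = True

print(grad_two_passes, grad_total, second_pass_failed)
```

Both options leave the same accumulated gradient in `w.grad`. Summing the losses needs only one pass over the shared part of the graph.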

* requires_grad

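- Default is False!

- Set it to True when gradients need to be computed for the Tensor (autograd then computes them automatically)

- The fact that gradients need to be computed for a Tensor does not mean its grad attribute will be populated.

- The grad attribute is populated only when requires_grad and is_leaf are both True (or when retain_grad() was called on a non-leaf Tensor).

- When is requires_grad set by hand?

1. When a pretrained model should be used as is, without updates (set requires_grad=False on its parameters)

: Useful for freezing part of a model

2. When backpropagating to the input itself (set requires_grad=True on the input; used in special cases such as adversarial examples, where the input itself is updated)

** Autograd has to keep a record of how each tensor was created. Tensors that require gradients come in two kinds: user-defined tensors that exist on their own (such as weights), and intermediate tensors created as the result of operations. The intermediate tensors do not need a gradient update!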
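
Both manual uses of requires_grad can be sketched in one snippet. `backbone` and `head` are hypothetical stand-ins (a real pretrained model would be loaded from elsewhere), and the sizes are arbitrary:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
backbone = nn.Linear(4, 4)          # hypothetical stand-in for a pretrained model
head = nn.Linear(4, 1)

# Case 1: freeze the backbone so that only the head is updated.
for p in backbone.parameters():
    p.requires_grad_(False)

# Case 2: ask for the gradient with respect to the input itself.
x = torch.randn(1, 4, requires_grad=True)

loss = head(backbone(x)).sum()
loss.backward()

backbone_frozen = backbone.weight.grad is None   # no gradient stored
head_trained = head.weight.grad is not None      # head still receives gradients
input_grad_shape = tuple(x.grad.shape)           # same shape as the input

print(backbone_frozen, head_trained, input_grad_shape)
```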

### Forward pass ###

import torch

# a : operations on it are recorded for backpropagation, and every tensor derived from it inherits requires_grad!

# a == weight

a = torch.tensor(2.0, requires_grad = True)

# a #

# data = tensor(2.0)

# grad = None

# grad_fn = None

# is_leaf = True

# requires_grad = True

# b == input

b = torch.tensor(3.0)

# b #

# data = tensor(3.0)

# grad = None

# grad_fn = None

# is_leaf = True

# requires_grad = False

c = a*b

# c #

# data = tensor(6.0)

# grad = None

# grad_fn = MulBackward0

# is_leaf = False

# requires_grad = True

### Dynamically Created Computational Graph in PyTorch ###

W_h = torch.randn(20, 20, requires_grad = True)

W_x = torch.randn(20, 10, requires_grad = True)

x = torch.randn(1, 10)

prev_h = torch.randn(1, 20)

# torch.mm() == matrix multiplication

# torch.t() == transpose

h2h = torch.mm(W_h, prev_h.t())

# W_h is 20x20 and prev_h is 1x20, so prev_h has to be transposed to 20x1 for the matrix product

i2h = torch.mm(W_x, x.t())

# W_x is 20x10 and x is 1x10, so x has to be transposed to 10x1 for the matrix product

next_h = h2h + i2h

next_h = next_h.tanh()

loss = next_h.sum()

# graph complete!

loss.backward() # compute gradients

### Backward example ###

import torch

x = torch.tensor(1.0, requires_grad=True)

z = x ** 3

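z.backward(torch.tensor(3.0))

# the argument 3.0 is the upstream gradient that multiplies dz/dx; it is not the point at which z' is evaluated

print(x.grad)

# dz/dx = 3 * x**2 = 3.0 at x = 1.0, so x.grad == 3.0 * 3.0 == tensor(9.) <-- 0-dim (scalar) tensor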

### no_grad example ###

import torch

x = torch.tensor(1.0, requires_grad = True)

y = x*2

# y.requires_grad = True

# inside no_grad no computational graph is built and the input data is not cached for backward

with torch.no_grad():

    y = x*2

# y.requires_grad = False

### detach example ###

x = torch.tensor(1.0, requires_grad = True)

print(x.requires_grad) # True

y = x.detach()

print(y.requires_grad) # False

print(x.eq(y).all()) # tensor(True) - every element of the two tensors is equal

License: Attribution-NonCommercial-ShareAlike
