These pre-trained word embeddings come from the [GloVe project](https://nlp.stanford.edu/projects/glove/). For more details about how the embeddings were generated, see [this paper](https://nlp.stanford.edu/pubs/glove.pdf).
%% Cell type:code id: tags:
``` python
args = Namespace(
    glove_filename='../data/glove.6B.100d.txt'
)
```
%% Cell type:code id: tags:
``` python
def load_word_vectors(filename):
    """
    A helper function to load word vectors from a file.
    """
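    # Body is a minimal sketch assuming GloVe's plain-text format:
    # each line is "<word> <v1> <v2> ...", space-separated.
    word_to_index, word_vectors = {}, []
    with open(filename) as fp:
        for line in fp:
            word, *values = line.rstrip().split(" ")
            word_to_index[word] = len(word_to_index)
            word_vectors.append([float(v) for v in values])
    return word_to_index, word_vectors
```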
%% Cell type:markdown id: tags:
We can use `view` to add size-1 dimensions, which can be useful for combining with other tensors: PyTorch automatically expands size-1 dimensions to match during elementwise operations. This is called broadcasting.
%% Cell type:code id: tags:
``` python
x = torch.arange(12).view(3, 4)
y = torch.arange(4).view(1, 4)
z = torch.arange(3).view(3, 1)
print(x)
print(y)
print(z)
print(x + y)
print(x + z)
```
%% Output
tensor([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]])
tensor([[ 0., 1., 2., 3.]])
tensor([[ 0.],
[ 1.],
[ 2.]])
tensor([[ 0., 2., 4., 6.],
[ 4., 6., 8., 10.],
[ 8., 10., 12., 14.]])
tensor([[ 0., 1., 2., 3.],
[ 5., 6., 7., 8.],
[ 10., 11., 12., 13.]])
%% Cell type:markdown id: tags:
`unsqueeze` and `squeeze` add and remove size-1 dimensions, respectively.
%% Cell type:code id: tags:
``` python
x = torch.arange(12).view(3, 4)
print(x.shape)
x = x.unsqueeze(dim=1)
print(x.shape)
x = x.squeeze()
print(x.shape)
```
%% Output
torch.Size([3, 4])
torch.Size([3, 1, 4])
torch.Size([3, 4])
%% Cell type:markdown id: tags:
All of the standard mathematical operations apply (such as `add` below):
%% Cell type:code id: tags:
``` python
x = torch.rand(3, 4)
print("x: \n", x)
print("--")
print("torch.add(x, x): \n", torch.add(x, x))
print("--")
print("x+x: \n", x + x)
```
%% Output
x:
tensor([[ 0.6662, 0.3343, 0.7893, 0.3216],
[ 0.5247, 0.6688, 0.8436, 0.4265],
[ 0.9561, 0.0770, 0.4108, 0.0014]])
--
torch.add(x, x):
tensor([[ 1.3324, 0.6686, 1.5786, 0.6433],
[ 1.0494, 1.3377, 1.6872, 0.8530],
[ 1.9123, 0.1540, 0.8216, 0.0028]])
--
x+x:
tensor([[ 1.3324, 0.6686, 1.5786, 0.6433],
[ 1.0494, 1.3377, 1.6872, 0.8530],
[ 1.9123, 0.1540, 0.8216, 0.0028]])
%% Cell type:markdown id: tags:
The convention of a trailing `_` indicating an in-place operation continues:
%% Cell type:code id: tags:
``` python
x = torch.arange(12).reshape(3, 4)
print(x)
print(x.add_(x))
```
%% Output
tensor([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]])
tensor([[ 0., 2., 4., 6.],
[ 8., 10., 12., 14.],
[ 16., 18., 20., 22.]])
%% Cell type:markdown id: tags:
There are many operations that reduce a dimension, such as `sum`:
%% Cell type:code id: tags:
``` python
x = torch.arange(12).reshape(3, 4)
print("x: \n", x)
print("---")
print("Summing across rows (dim=0): \n", x.sum(dim=0))
print("---")
print("Summing across columns (dim=1): \n", x.sum(dim=1))
```
%% Output
x:
tensor([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]])
---
Summing across rows (dim=0):
tensor([ 12., 15., 18., 21.])
---
Summing across columns (dim=1):
tensor([ 6., 22., 38.])
%% Cell type:markdown id: tags:
#### Indexing, Slicing, Joining and Mutating
%% Cell type:code id: tags:
``` python
x = torch.arange(6).view(2, 3)
print("x: \n", x)
print("---")
print("x[:2, :2]: \n", x[:2, :2])
print("---")
print("x[0][1]: \n", x[0][1])
print("---")
print("Setting [0][1] to be 8")
x[0][1] = 8
print(x)
```
%% Output
x:
tensor([[ 0., 1., 2.],
[ 3., 4., 5.]])
---
x[:2, :2]:
tensor([[ 0., 1.],
[ 3., 4.]])
---
x[0][1]:
tensor(1.)
---
Setting [0][1] to be 8
tensor([[ 0., 8., 2.],
[ 3., 4., 5.]])
%% Cell type:markdown id: tags:
We can select a subset of a tensor using `index_select`:
%% Cell type:code id: tags:
``` python
x = torch.arange(9).view(3, 3)
print(x)
print("---")
indices = torch.LongTensor([0, 2])
print(torch.index_select(x, dim=0, index=indices))
print("---")
indices = torch.LongTensor([0, 2])
print(torch.index_select(x, dim=1, index=indices))
```
%% Output
tensor([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]])
---
tensor([[ 0., 1., 2.],
[ 6., 7., 8.]])
---
tensor([[ 0., 2.],
[ 3., 5.],
[ 6., 8.]])
%% Cell type:markdown id: tags:
We can also use NumPy-style advanced indexing:
%% Cell type:code id: tags:
``` python
x = torch.arange(9).view(3, 3)
indices = torch.LongTensor([0, 2])
print(x[indices])
print("---")
print(x[indices, :])
print("---")
print(x[:, indices])
```
%% Output
tensor([[ 0., 1., 2.],
[ 6., 7., 8.]])
---
tensor([[ 0., 1., 2.],
[ 6., 7., 8.]])
---
tensor([[ 0., 2.],
[ 3., 5.],
[ 6., 8.]])
%% Cell type:markdown id: tags:
We can combine tensors by concatenating them along the rows (dim=0) or the columns (dim=1), or by stacking them along a new dimension:
%% Cell type:code id: tags:
``` python
x = torch.arange(6).view(2, 3)
describe(x)
describe(torch.cat([x, x], dim=0))
describe(torch.cat([x, x], dim=1))
describe(torch.stack([x, x]))
```
%% Output
Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[ 0., 1., 2.],
[ 3., 4., 5.]])
Type: torch.FloatTensor
Shape/size: torch.Size([4, 3])
Values:
tensor([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 0., 1., 2.],
[ 3., 4., 5.]])
Type: torch.FloatTensor
Shape/size: torch.Size([2, 6])
Values:
tensor([[ 0., 1., 2., 0., 1., 2.],
[ 3., 4., 5., 3., 4., 5.]])
Type: torch.FloatTensor
Shape/size: torch.Size([2, 2, 3])
Values:
tensor([[[ 0., 1., 2.],
[ 3., 4., 5.]],
[[ 0., 1., 2.],
[ 3., 4., 5.]]])
%% Cell type:markdown id: tags:
We can concatenate along dimension 1, the columns:
%% Cell type:code id: tags:
``` python
x = torch.arange(9).view(3, 3)
print(x)
print("---")
new_x = torch.cat([x, x, x], dim=1)
print(new_x.shape)
print(new_x)
```
%% Output
tensor([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]])
---
torch.Size([3, 9])
tensor([[ 0., 1., 2., 0., 1., 2., 0., 1., 2.],
[ 3., 4., 5., 3., 4., 5., 3., 4., 5.],
[ 6., 7., 8., 6., 7., 8., 6., 7., 8.]])
%% Cell type:markdown id: tags:
We can also concatenate on a new 0th dimension to "stack" the tensors:
%% Cell type:code id: tags:
``` python
x = torch.arange(9).view(3, 3)
print(x)
print("---")
new_x = torch.stack([x, x, x])
print(new_x.shape)
print(new_x)
```
%% Output
tensor([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]])
---
torch.Size([3, 3, 3])
tensor([[[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]],
[[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]],
[[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]]])
%% Cell type:markdown id: tags:
#### Linear Algebra Tensor Functions
%% Cell type:markdown id: tags:
Transposing allows you to swap two dimensions onto different axes, so that all the rows become columns and vice versa.
%% Cell type:code id: tags:
``` python
x = torch.arange(0, 12).view(3, 4)
print("x: \n", x)
print("---")
print("x.transpose(1, 0): \n", x.transpose(1, 0))
```
%% Output
x:
tensor([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]])
---
x.transpose(1, 0):
tensor([[ 0., 4., 8.],
[ 1., 5., 9.],
[ 2., 6., 10.],
[ 3., 7., 11.]])
%% Cell type:markdown id: tags:
A three-dimensional tensor might represent a batch of sequences, where each sequence item has a feature vector. It is common to switch the batch and sequence dimensions so that we can more easily index the sequence in a sequence model.
Note: `transpose` will only let you swap two axes; `permute` allows reordering any number of axes at once. A minimal sketch of both:
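%% Cell type:code id: tags:
``` python
# Illustrative shapes assumed: a (batch, sequence, features) tensor.
x = torch.arange(24).view(2, 3, 4)
print(x.transpose(0, 1).shape)    # swaps exactly two axes -> torch.Size([3, 2, 4])
print(x.permute(1, 0, 2).shape)   # reorders any number of axes -> torch.Size([3, 2, 4])
print(x.permute(2, 0, 1).shape)   # -> torch.Size([4, 2, 3])
```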
%% Cell type:markdown id: tags:
In this small snippet, you can see gradient computation at work. We create a tensor that requires gradients and multiply it by 3. Then, we create a scalar output using `sum()`; a scalar output is needed to act as the loss variable. Calling `backward()` on the loss computes its rate of change with respect to the inputs. Since the scalar was created with `sum()`, each position in z and in x is independent with respect to the loss scalar.
The rate of change of x with respect to the output is just the constant 3 that we multiplied x by; a sketch of the described snippet follows.
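%% Cell type:code id: tags:
``` python
# A minimal sketch of the snippet described above (reconstructed; names assumed):
x = torch.ones(2, 2, requires_grad=True)
z = 3 * x
loss = z.sum()
loss.backward()
print(x.grad)   # every entry is 3.0
```
%% Cell type:markdown id: tags:
Calling `backward()` on a non-scalar output fails. Here, assume `f` branches on the whole tensor at once, applying `sin` if every element is positive and `cos` otherwise (a reconstruction consistent with the gradients printed further below):
%% Cell type:code id: tags:
``` python
def f(x):
    # branches on the whole tensor, not elementwise
    if (x.data > 0).all():
        return torch.sin(x)
    return torch.cos(x)

x = torch.tensor([1.0, 0.5], requires_grad=True)
y = f(x)
y.backward()   # y has two elements; backward needs a scalar
```
%% Output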
~/anaconda3/envs/pytorch04/lib/python3.6/site-packages/torch/autograd/__init__.py in _make_grads(outputs, grads)
25 if out.requires_grad:
26 if out.numel() != 1:
---> 27 raise RuntimeError("grad can be implicitly created only for scalar outputs")
28 new_grads.append(torch.ones_like(out))
29 else:
RuntimeError: grad can be implicitly created only for scalar outputs
%% Cell type:markdown id: tags:
Making the output a scalar:
%% Cell type:code id: tags:
``` python
x = torch.tensor([1.0, 0.5], requires_grad=True)
y = f(x)
y.sum().backward()
print(x.grad)
```
%% Output
tensor([ 0.5403, 0.8776])
%% Cell type:markdown id: tags:
But there is an issue: this isn't right for this edge case:
%% Cell type:code id: tags:
``` python
x = torch.tensor([1.0, -1], requires_grad=True)
y = f(x)
y.sum().backward()
print(x.grad)
```
%% Output
tensor([-0.8415, 0.8415])
%% Cell type:code id: tags:
``` python
x = torch.tensor([-0.5, -1], requires_grad=True)
y = f(x)
y.sum().backward()
print(x.grad)
```
%% Output
tensor([ 0.4794, 0.8415])
%% Cell type:markdown id: tags:
This is because we aren't doing the boolean check and the subsequent application of `cos` and `sin` on an elementwise basis. To solve this, it is common to use masking, as in the sketch below:
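%% Cell type:code id: tags:
``` python
# A sketch of the masked, elementwise version (reconstructed; the name f2 is assumed):
def f2(x):
    mask = (x > 0).float()   # no gradient flows through the comparison
    return mask * torch.sin(x) + (1 - mask) * torch.cos(x)

x = torch.tensor([1.0, -1], requires_grad=True)
y = f2(x)
y.sum().backward()
print(x.grad)   # tensor([ 0.5403, 0.8415])
```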
%% Cell type:markdown id: tags:
Note that tensors living on different devices cannot be mixed in a single operation; adding a CUDA tensor to a CPU tensor, for example, raises:
RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #3 'other'
%% Cell type:markdown id: tags:
### Exercises
Some of these exercises require operations not covered in the notebook. You will have to look at [the documentation](https://pytorch.org/docs/) (on purpose!).
(Answers are at the bottom)
%% Cell type:markdown id: tags:
#### Exercise 1
Create a 2D tensor and then add a dimension of size 1 inserted at the 0th axis.
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
#### Exercise 2
Remove the extra dimension you just added to the previous tensor.
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
#### Exercise 3
Create a random tensor of shape 5x3 in the interval [3, 7)
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
#### Exercise 4
Create a tensor with values from a normal distribution (mean=0, std=1).
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
#### Exercise 5
Retrieve the indices of all the non-zero elements in the tensor `torch.Tensor([1, 1, 1, 0, 1])`.
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
#### Exercise 6
Create a random tensor of size (3,1) and then horizontally stack four copies together.
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
#### Exercise 7
Return the batch matrix-matrix product of two 3-dimensional matrices (`a=torch.rand(3,4,5)`, `b=torch.rand(3,5,4)`).
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
#### Exercise 8
Return the batch matrix-matrix product of a 3D matrix and a 2D matrix (a=torch.rand(3,4,5), b=torch.rand(5,4)).
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
Answers below
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
Answers are still below... keep going!
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
#### Exercise 1
Create a 2D tensor and then add a dimension of size 1 inserted at the 0th axis.
%% Cell type:code id: tags:
``` python
a = torch.rand(3,3)
a = a.unsqueeze(0)
print(a)
print(a.shape)
```
%% Output
tensor([[[ 0.7077, 0.4189, 0.0655],
[ 0.8839, 0.8083, 0.7528],
[ 0.8988, 0.6839, 0.7658]]])
torch.Size([1, 3, 3])
%% Cell type:markdown id: tags:
#### Exercise 2
Remove the extra dimension you just added to the previous tensor.
%% Cell type:code id: tags:
``` python
a = a.squeeze(0)
print(a.shape)
```
%% Output
torch.Size([3, 3])
%% Cell type:markdown id: tags:
#### Exercise 3
Create a random tensor of shape 5x3 in the interval [3, 7)
%% Cell type:code id: tags:
``` python
3 + torch.rand(5, 3) * 4
```
%% Output
tensor([[ 6.6597, 4.5970, 3.4402],
[ 4.0164, 4.7330, 4.7802],
[ 4.9864, 6.1461, 5.6416],
[ 3.5212, 4.3992, 4.5295],
[ 6.2172, 4.2744, 4.1632]])
%% Cell type:markdown id: tags:
#### Exercise 4
Create a tensor with values from a normal distribution (mean=0, std=1).
%% Cell type:code id: tags:
``` python
a = torch.rand(3,3)
a.normal_(mean=0, std=1)
```
%% Output
tensor([[-0.2107, 1.1399, -2.5122],
[ 1.3823, 0.9847, 1.4719],
[ 0.3100, 1.5829, 0.2351]])
%% Cell type:markdown id: tags:
#### Exercise 5
Retrieve the indices of all the non-zero elements in the tensor `torch.Tensor([1, 1, 1, 0, 1])`.
%% Cell type:code id: tags:
``` python
a = torch.Tensor([1, 1, 1, 0, 1])
torch.nonzero(a)
```
%% Output
tensor([[ 0],
[ 1],
[ 2],
[ 4]])
%% Cell type:markdown id: tags:
#### Exercise 6
Create a random tensor of size (3,1) and then horizontally stack four copies together.
%% Cell type:code id: tags:
``` python
a = torch.rand(3,1)
a.expand(3,4)
```
%% Output
tensor([[ 0.7595, 0.7595, 0.7595, 0.7595],
[ 0.5311, 0.5311, 0.5311, 0.5311],
[ 0.6449, 0.6449, 0.6449, 0.6449]])
%% Cell type:markdown id: tags:
#### Exercise 7
Return the batch matrix-matrix product of two 3-dimensional matrices (`a=torch.rand(3,4,5)`, `b=torch.rand(3,5,4)`).
%% Cell type:code id: tags:
``` python
a = torch.rand(3,4,5)
b = torch.rand(3,5,4)
torch.bmm(a, b)
```
%% Output
tensor([[[ 1.7768, 1.5815, 1.7667, 0.9918],
[ 0.7049, 0.7050, 0.6055, 0.3455],
[ 1.5937, 1.3627, 1.6757, 1.0042],
[ 1.2478, 0.9978, 0.8067, 1.1299]],
[[ 1.4816, 1.4685, 1.7443, 1.6224],
[ 0.8311, 1.1861, 1.2165, 0.9788],
[ 1.3339, 1.3306, 1.7348, 1.3621],
[ 1.3177, 1.5979, 1.5706, 1.3298]],
[[ 1.8358, 0.8323, 1.2206, 1.3237],
[ 2.0028, 1.5150, 1.5610, 1.2854],
[ 2.2775, 1.6230, 1.9977, 1.8435],
[ 2.5799, 1.6463, 1.8448, 1.7839]]])
%% Cell type:markdown id: tags:
#### Exercise 8
Return the batch matrix-matrix product of a 3D matrix and a 2D matrix (a=torch.rand(3,4,5), b=torch.rand(5,4)).
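%% Cell type:markdown id: tags:
One way, sketched below, is to give `b` a batch dimension and expand it to match `a`:
%% Cell type:code id: tags:
``` python
# A sketch: expand b along a new batch dimension so bmm applies.
a = torch.rand(3, 4, 5)
b = torch.rand(5, 4)
torch.bmm(a, b.unsqueeze(0).expand(a.size(0), *b.size()))
```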