Design and Debug Techniques Blog: Vitis AI - How tensors allow for efficient memory u...


07-08-2020 06:50 AM

In data manipulation it is typical to reshape or reorder the original data, creating multiple copies: each new step creates a new copy. As the program grows, so does the occupied memory, and I seldom think to worry about it until an Out Of Memory error happens.

The amazing thing about tensors is that multiple tensors can refer to the same storage (a contiguous chunk of memory holding numbers of a given type). This is managed through `torch.Storage`.

Each tensor has a `.storage()` method which shows us the tensor content as it is laid out in memory.

In the next article I will write about the even more amazing Tensor ability of tracking its ancestor operations, but here I will mostly focus on memory optimization.

In [1]:

import torch

a = torch.randint(0, 9, (5,3))

a

Out[1]:

tensor([[4, 1, 6], [0, 8, 8], [1, 2, 1], [0, 5, 7], [0, 0, 7]])

In [2]:

`a.storage()`

Out[2]:

4 1 6 0 8 8 1 2 1 0 5 7 0 0 7 [torch.LongStorage of size 15]

In [3]:

`a.shape`

Out[3]:

torch.Size([5, 3])

We might need to transpose and flatten the original **a** tensor.

Why should we waste double the memory for the same data, even with a different shape?

In [4]:

`b = torch.transpose(a, 0, 1)`

b

Out[4]:

tensor([[4, 0, 1, 0, 0], [1, 8, 2, 5, 0], [6, 8, 1, 7, 7]])

**a** and **b** are indeed tensors pointing to the same storage.

They are represented differently because we tell them to read the storage in a different order, using the stride.

**b** has a stride equal to (1, 3), meaning that when reading the storage it must move ahead 1 element to reach the next row, and 3 elements to reach the next column.

In [5]:

`b.stride(), a.stride()`

Out[5]:

((1, 3), (3, 1))
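To see what these stride values mean in practice, here is a small sketch (the helper `storage_index` is my own, not a PyTorch API) that maps a 2-D index to its flat position in the storage:

```python
import torch

a = torch.randint(0, 9, (5, 3))
b = torch.transpose(a, 0, 1)

def storage_index(t, i, j):
    # Flat storage position of element (i, j):
    # start of the view + i steps along dim 0 + j steps along dim 1.
    return t.storage_offset() + i * t.stride(0) + j * t.stride(1)

s = a.storage()
# a[1, 2] and b[2, 1] are the same element, at flat index 1*3 + 2 = 5.
assert storage_index(a, 1, 2) == storage_index(b, 2, 1) == 5
assert s[5] == a[1, 2]
```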

We can access the data from **a**, **b**, or directly from the original storage.

However, if we access the data from the storage, the value we read is a plain number, no longer a tensor.

In [6]:

`a[1,2], b[2,1], a.storage()[5], b.storage()[5]`

Out[6]:

(tensor(8), tensor(8), 8, 8)

Now, one odd thing that drove me crazy was finding that my tensors magically changed value by themselves:

When changing **a**, **b** is also modified.

In [7]:

`a[0,0] = 10`

b[0,0]

Out[7]:

tensor(10)

This happens because, from a memory point of view, a tensor is just an ordered representation of a storage.

Two tensors generated from the same storage are **not** independent: I must remember that every time I change one tensor, all other tensors pointing to the same storage are modified as well.

An efficient usage of the memory has its drawbacks!
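A quick way to confirm whether two tensors share memory is to compare their `data_ptr()`, which returns the address of the first element. A minimal sketch (the variable names are illustrative):

```python
import torch

a = torch.zeros(5, 3)
b = torch.transpose(a, 0, 1)   # a view: same storage, different strides
d = a + 0                      # arithmetic: allocates a new storage

# data_ptr() returns the memory address of the first element.
print(a.data_ptr() == b.data_ptr())  # True: b reads a's storage
print(a.data_ptr() == d.data_ptr())  # False: d owns its own memory
```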

Subsets

Subsets of the original data still make efficient use of the memory.

The new tensor will still point to a subset of the original storage.

In [8]:

`c = a[0:2, 0:2]`

c

Out[8]:

tensor([[10, 1], [ 0, 8]])

In [9]:

`c[0,0]=77`

a

Out[9]:

tensor([[77, 1, 6], [ 0, 8, 8], [ 1, 2, 1], [ 0, 5, 7], [ 0, 0, 7]])
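The sliced view does not necessarily start at the beginning of the storage: `storage_offset()` tells us where it begins. A small sketch with my own example slice (same 5x3 shape as above):

```python
import torch

a = torch.arange(15).reshape(5, 3)
c = a[1:3, 1:3]            # a 2x2 window starting at row 1, col 1

# Element (1, 1) of a sits at flat index 1*3 + 1 = 4 in the storage,
# and the slice keeps the parent's strides.
print(c.storage_offset())  # 4
print(c.stride())          # (3, 1)
```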

In-place operations

In-place operations (the methods ending with an underscore, like `zero_()`) modify the storage directly, so every tensor viewing that storage sees the change:

In [10]:

`a.zero_()`

b

Out[10]:

tensor([[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]])

When we really need an independent new tensor, we can clone it.

A new storage will also be created.

In [11]:

`a_clone = a.clone()`

a_clone[0,0] = 55

a_clone

Out[11]:

tensor([[55, 0, 0], [ 0, 0, 0], [ 0, 0, 0], [ 0, 0, 0], [ 0, 0, 0]])

In [12]:

`a`

Out[12]:

tensor([[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]])
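We can confirm that `clone()` really allocates fresh memory with the same `data_ptr()` check as before. A quick sketch:

```python
import torch

a = torch.zeros(5, 3, dtype=torch.long)
a_clone = a.clone()

# clone() copies the data into a brand new storage.
print(a.data_ptr() == a_clone.data_ptr())  # False: separate memory
a_clone[0, 0] = 55
print(a[0, 0].item())                      # 0: the original is untouched
```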

Some functions only work on contiguous tensors.

When we transposed **a**, we generated **b**, a new tensor whose values are read from the storage in a non-contiguous order.

In [13]:

`a.is_contiguous()`

Out[13]:

True

In [14]:

`b.is_contiguous()`

Out[14]:

False
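`view()` is one such function: it requires a contiguous layout, while `reshape()` falls back to making a copy when needed. A small sketch:

```python
import torch

a = torch.arange(15).reshape(5, 3)
b = torch.transpose(a, 0, 1)     # non-contiguous view of a

try:
    b.view(-1)                   # view() needs contiguous memory
except RuntimeError as e:
    print("view failed:", e)

flat = b.reshape(-1)             # reshape() copies when it has to
print(flat.shape)                # torch.Size([15])
```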

We can make **b** contiguous, but this will generate a new, reshuffled storage for **b**, and **a** and **b** will be forever independent:

In [15]:

`b = b.contiguous()`

b[0,0] = 18

a[0,0]

Out[15]:

tensor(0)

In [16]:

`b.is_contiguous()`

Out[16]:

True

In [17]:

`a.is_contiguous()`

Out[17]:

True
