AttributeError: 'HabanaParameterWrapper' object has no attribute 'change_device_placement'

When I try to copy the weights from a layer that is already on HPU into a layer on CPU, this error occurs.
Code:

import os
os.environ['PT_HPU_LAZY_MODE'] = '1'
os.environ['LOG_LEVEL_PT_FALLBACK']='1'
os.environ['PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES']='1'
os.environ['LOG_LEVEL_ALL']='3'
os.environ['ENABLE_CONSOLE']='true'
import habana_frameworks.torch.core as htcore
import torch
with torch.no_grad():

Error:

[05:22:59.397349][SYN_API       ][info ][tid:4AD0] + ---------------------------------------------------------------------- +
[05:22:59.397377][SYN_API       ][info ][tid:4AD0] | Version:            1.17.1                                             |
[05:22:59.397383][SYN_API       ][info ][tid:4AD0] | Synapse:            db1a431                                            |
[05:22:59.397387][SYN_API       ][info ][tid:4AD0] | HCL:                1.17.1-a6d0341                                     |
[05:22:59.397391][SYN_API       ][info ][tid:4AD0] | MME:                f1ec30d                                            |
[05:22:59.397395][SYN_API       ][info ][tid:4AD0] | SCAL:               b05f1cf                                            |
[05:22:59.397400][SYN_API       ][info ][tid:4AD0] | Description:        HabanaLabs Runtime and GraphCompiler               |
[05:22:59.397430][SYN_API       ][info ][tid:4AD0] | Time:               2024-09-23 05:22:59.397404                         |
[05:22:59.397438][SYN_API       ][info ][tid:4AD0] + ---------------------------------------------------------------------- +
[05:22:59.422401][KERNEL_DB             ][warning][tid:4AD0] Failed loading version number from libTPCFuser.so
[05:22:59.426857][KERNEL_DB             ][warning][tid:4AD0] Failed loading version number from libTPCFuser.so
[05:22:59.429324][KERNEL_DB             ][warning][tid:4AD0] Failed loading version number from libTPCFuser.so
[05:23:00.475872][TPC_NODE              ][warning][tid:4AD0] Can't access halReader, setting maxNumOfTPCS to 24 
[05:23:00.477920][TPC_NODE              ][warning][tid:4AD0] Can't access halReader, setting maxNumOfTPCS to 64 
[05:23:00.485751][SYN_API       ][info ][tid:4AD0] synInitialize, status 0[synSuccess]
[05:23:00.525206][SCAL][info ][tid:4AD0] +-------------------------------------------------+
[05:23:00.525230][SCAL][info ][tid:4AD0] SCAL Commit SHA1 = b05f1cf
[05:23:00.525233][SCAL][info ][tid:4AD0] SCAL Build Time = Tue Aug 20 03:15:55 PM IDT 2024
[05:23:00.525238][SCAL][info ][tid:4AD0] SCAL loading config from default.json
[05:23:00.525244][SCAL][info ][tid:4AD0] SCAL config Hash = 0x2c64b209f6e3077f
[05:23:00.525247][SCAL][info ][tid:4AD0] +-------------------------------------------------+
[05:23:02.206154][HCL       ][info ][tid:4AD0] Version:	1.17.1-a6d0341
[05:23:02.206324][HL_GCFG][warning][tid:4AD0] setValue: override BOX_TYPE_ID value that already set from observation
[05:23:02.946268][SYN_API       ][info ][tid:4AD0] synDeviceAcquireByDeviceType, status 0[synSuccess]
[05:23:02.946734][PT_SYNHELPER    ][warning][tid:4AD0] /npu-stack/pytorch-integration/backend/synapse_helpers/mem_hlml.cpp:117	Allocated hlml shared memory0x7f3ddebae000
============================= HABANA PT BRIDGE CONFIGURATION =========================== 
 PT_HPU_LAZY_MODE = 1
 PT_RECIPE_CACHE_PATH = 
 PT_CACHE_FOLDER_DELETE = 0
 PT_HPU_RECIPE_CACHE_CONFIG = 
 PT_HPU_MAX_COMPOUND_OP_SIZE = 9223372036854775807
 PT_HPU_LAZY_ACC_PAR_MODE = 1
 PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES = 1
---------------------------: System Configuration :---------------------------
Num CPU Cores : 160
CPU RAM       : 2113407792 KB
------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/code/dummy/test.py", line 12, in <module>
    test_conv.weight.data = conv.weight.data
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/core/weight_sharing.py", line 32, in __setattr__
    return object.__setattr__(self_, name, value)
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/core/weight_sharing.py", line 55, in __torch_function__
    new_args[0].change_device_placement(new_args[1].device)
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/core/weight_sharing.py", line 24, in __getattribute__
    return object.__getattribute__(self_, name)
AttributeError: 'HabanaParameterWrapper' object has no attribute 'change_device_placement'
[05:23:04.770297][SYN_API       ][info ][tid:4AD0] synDeviceRelease, status 0[synSuccess]
[05:23:04.779318][SYN_API       ][info ][tid:4AD0] synDestroy, status 0[synSuccess]

System info:

================================================================================
Environment
================================================================================
Device                               gaudi2 
OS                                   ubuntu 
OS version                           22.04  
Log file                             /var/log/habana_logs/install-2024-09-23-14-25-50.log
Release version                      1.17.0-495
Habanalabs server                    vault.habana.ai
Rewrite installer config             no     
Install type                         validate
Python repo URL                      https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
Habanalabs software                  [OK]   
habanalabs-container-runtime=1.17.1-40
habanalabs-dkms=1.17.0-495
habanalabs-firmware=1.17.0-495
habanalabs-firmware-odm=1.17.0-495
habanalabs-firmware-tools=1.17.0-495
habanalabs-graph=1.17.0-495
habanalabs-qual=1.17.0-495
habanalabs-rdma-core=1.17.0-495
habanalabs-thunk=1.17.0-495
================================================================================
System
================================================================================
CPU: 160
Model name: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
MemTotal: 2113407792 kB
Hugepagesize: 2048 kB
================================================================================
OS environment
================================================================================
Package manager                      [apt]  
The required sudo privileges are     [ok]   
Python 3.10                          [OK]   
================================================================================
Basic dependencies
================================================================================
gcc                                  [OK]   
cmake                                [OK]   
lsof                                 [OK]   
curl                                 [OK]   
wget                                 [OK]   
linux-headers-5.15.0-113-generic     [OK]   
ethtool                              [OK]   
libelf-dev                           [OK]   
libbz2-dev                           [OK]   
liblzma-dev                          [OK]   
libibverbs-dev                       [OK]   
librdmacm-dev                        [OK]   
dkms                                 [OK]   
linux-modules-extra-5.15.0-113-generic  [OK]   
================================================================================
PyTorch dependencies
================================================================================
gcc                                  [OK]   
cmake                                [OK]   
lsof                                 [OK]   
curl                                 [OK]   
wget                                 [OK]   
unzip                                [MISSING]
libcurl4                             [OK]   
moreutils                            [MISSING]
iproute2                             [OK]   
libcairo2-dev                        [MISSING]
libglib2.0-dev                       [MISSING]
libselinux1-dev                      [MISSING]
libnuma-dev                          [MISSING]
libpcre2-dev                         [MISSING]
libatlas-base-dev                    [MISSING]
libjpeg-dev                          [MISSING]
liblapack-dev                        [MISSING]
libnuma-dev                          [MISSING]
google-perftools                     [MISSING]
numactl                              [MISSING]
libopenblas-dev                      [MISSING]
Open MPI version 4.1.5 will be install
================================================================================
Installed Habanalabs software
================================================================================
habanalabs-container-runtime=1.17.1-40
habanalabs-dkms=1.17.0-495
habanalabs-firmware=1.17.0-495
habanalabs-firmware-odm=1.17.0-495
habanalabs-firmware-tools=1.17.0-495
habanalabs-graph=1.17.0-495
habanalabs-qual=1.17.0-495
habanalabs-rdma-core=1.17.0-495
habanalabs-thunk=1.17.0-495
================================================================================
Full install log: /var/log/habana_logs/install-2024-09-23-14-25-50.log
================================================================================

The code ends with “with torch.no_grad():”, without the actual body. Could you please post what follows torch.no_grad()?

Sorry, I accidentally posted without the code. Here’s the code snippet to reproduce the error above.

import os
os.environ['PT_HPU_LAZY_MODE'] = '1'
os.environ['LOG_LEVEL_PT_FALLBACK'] = '1'
os.environ['PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES'] = '1'
os.environ['LOG_LEVEL_ALL'] = '3'
os.environ['ENABLE_CONSOLE'] = 'true'
import habana_frameworks.torch.core as htcore
import torch
with torch.no_grad():
        conv = torch.nn.Conv2d(3,64,(32,32)).to('hpu')
        test_conv = torch.nn.Conv2d(3,64,(32,32))
        test_conv.weight.data = conv.weight.data
Running this produces the error output shown above.

I also tried a workaround of copying the weight on CPU and then moving it back to HPU, but it did not work either; see the sketch below.
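The attempt looked roughly like this (a sketch of the idea, not the exact code):

w_cpu = conv.weight.data.to('cpu')   # copy the HPU weight to CPU
test_conv.weight.data = w_cpu        # assign while test_conv is still on CPU
test_conv = test_conv.to('hpu')      # then move the module back to HPU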

Hi Kwanhee,

Would you be able to try changing line 12 from
test_conv.weight.data = conv.weight.data → test_conv.weight = conv.weight
Assigning the Parameter itself avoids reassigning .data through the wrapper’s __setattr__/__torch_function__ path, which appears to be where the traceback above fails.

import os
os.environ['PT_HPU_LAZY_MODE'] = '1'
os.environ['LOG_LEVEL_PT_FALLBACK'] = '1'
os.environ['PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES'] = '1'
os.environ['LOG_LEVEL_ALL'] = '3'
os.environ['ENABLE_CONSOLE'] = 'true'
import habana_frameworks.torch.core as htcore
import torch
with torch.no_grad():
        conv = torch.nn.Conv2d(3,64,(32,32)).to('hpu')
        test_conv = torch.nn.Conv2d(3,64,(32,32))
        test_conv.weight = conv.weight
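One note on why this works (my understanding, based on stock nn.Module semantics rather than Habana internals): assigning the Parameter shares it instead of copying values, so both modules should reference the same tensor on HPU:

print(test_conv.weight is conv.weight)  # expected True: the parameter is shared, not copied
print(test_conv.weight.device)          # expected hpu:0 — test_conv's weight now lives on HPU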


Thanks!

Hi

Can you please describe the use case? This looks like a specific unit test intended to reproduce the issue, but if we understood the use case, we might be able to suggest a workaround.

For example, if you want an nn.Module with two conv layers to share the same weights, you could create the module on CPU, copy the weights from one layer to the other, and then save the state_dict.
Next, create a new module, load the saved state_dict back, and then move the model to HPU.

Roughly, in code:

m0 = Model()  # model on CPU, with conv0 and conv1 layers
m0.conv1.load_state_dict(m0.conv0.state_dict())  # copy conv0's weights into conv1
torch.save(m0.state_dict(), 'm0.pt')

m1 = Model()
m1.load_state_dict(torch.load('m0.pt'))
m1 = m1.to('hpu')

Or you might be trying to do something else; perhaps we can suggest an alternative if we had more context.

It is a specific unit test indeed; I encountered this issue while trying to implement network pruning methods.

Normally, when pruning weights (which are usually on the GPU/HPU), we copy them, compute a pruning metric, and replace the weights with the pruned ones on the fly, roughly as in the sketch below; I was using this code snippet to implement the pruning metric.
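A minimal magnitude-pruning sketch of that flow (the 50% ratio and variable names are illustrative, not from any pruning library):

with torch.no_grad():
    w = conv.weight.data.to('cpu')                    # pull the HPU weights to CPU for the metric
    k = max(1, int(0.5 * w.numel()))                  # illustrative 50% prune ratio
    threshold = w.abs().flatten().kthvalue(k).values  # magnitude threshold
    mask = (w.abs() > threshold).to(w.dtype)          # 1 = keep, 0 = prune
    conv.weight.data.copy_(w * mask)                  # in-place copy_ writes back to the HPU weight without reassigning .data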

I think the above solution will work out.

Thank you for the reply!


Thank you for the reply. It worked out!
