When I try to copy the weights of layers that are already on HPU to the CPU, I get the error below.
Code:
import os
os.environ['PT_HPU_LAZY_MODE'] = '1'
os.environ['LOG_LEVEL_PT_FALLBACK']='1'
os.environ['PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES']='1'
os.environ['LOG_LEVEL_ALL']='3'
os.environ['ENABLE_CONSOLE']='true'
import habana_frameworks.torch.core as htcore
import torch
with torch.no_grad():
Error:
[05:22:59.397349][SYN_API ][info ][tid:4AD0] + ---------------------------------------------------------------------- +
[05:22:59.397377][SYN_API ][info ][tid:4AD0] | Version: 1.17.1 |
[05:22:59.397383][SYN_API ][info ][tid:4AD0] | Synapse: db1a431 |
[05:22:59.397387][SYN_API ][info ][tid:4AD0] | HCL: 1.17.1-a6d0341 |
[05:22:59.397391][SYN_API ][info ][tid:4AD0] | MME: f1ec30d |
[05:22:59.397395][SYN_API ][info ][tid:4AD0] | SCAL: b05f1cf |
[05:22:59.397400][SYN_API ][info ][tid:4AD0] | Description: HabanaLabs Runtime and GraphCompiler |
[05:22:59.397430][SYN_API ][info ][tid:4AD0] | Time: 2024-09-23 05:22:59.397404 |
[05:22:59.397438][SYN_API ][info ][tid:4AD0] + ---------------------------------------------------------------------- +
[05:22:59.422401][KERNEL_DB ][warning][tid:4AD0] Failed loading version number from libTPCFuser.so
[05:22:59.426857][KERNEL_DB ][warning][tid:4AD0] Failed loading version number from libTPCFuser.so
[05:22:59.429324][KERNEL_DB ][warning][tid:4AD0] Failed loading version number from libTPCFuser.so
[05:23:00.475872][TPC_NODE ][warning][tid:4AD0] Can't access halReader, setting maxNumOfTPCS to 24
[05:23:00.477920][TPC_NODE ][warning][tid:4AD0] Can't access halReader, setting maxNumOfTPCS to 64
[05:23:00.485751][SYN_API ][info ][tid:4AD0] synInitialize, status 0[synSuccess]
[05:23:00.525206][SCAL][info ][tid:4AD0] +-------------------------------------------------+
[05:23:00.525230][SCAL][info ][tid:4AD0] SCAL Commit SHA1 = b05f1cf
[05:23:00.525233][SCAL][info ][tid:4AD0] SCAL Build Time = Tue Aug 20 03:15:55 PM IDT 2024
[05:23:00.525238][SCAL][info ][tid:4AD0] SCAL loading config from default.json
[05:23:00.525244][SCAL][info ][tid:4AD0] SCAL config Hash = 0x2c64b209f6e3077f
[05:23:00.525247][SCAL][info ][tid:4AD0] +-------------------------------------------------+
[05:23:02.206154][HCL ][info ][tid:4AD0] Version: 1.17.1-a6d0341
[05:23:02.206324][HL_GCFG][warning][tid:4AD0] setValue: override BOX_TYPE_ID value that already set from observation
[05:23:02.946268][SYN_API ][info ][tid:4AD0] synDeviceAcquireByDeviceType, status 0[synSuccess]
[05:23:02.946734][PT_SYNHELPER ][warning][tid:4AD0] /npu-stack/pytorch-integration/backend/synapse_helpers/mem_hlml.cpp:117 Allocated hlml shared memory0x7f3ddebae000
============================= HABANA PT BRIDGE CONFIGURATION ===========================
PT_HPU_LAZY_MODE = 1
PT_RECIPE_CACHE_PATH =
PT_CACHE_FOLDER_DELETE = 0
PT_HPU_RECIPE_CACHE_CONFIG =
PT_HPU_MAX_COMPOUND_OP_SIZE = 9223372036854775807
PT_HPU_LAZY_ACC_PAR_MODE = 1
PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES = 1
---------------------------: System Configuration :---------------------------
Num CPU Cores : 160
CPU RAM : 2113407792 KB
------------------------------------------------------------------------------
Traceback (most recent call last):
File "/root/code/dummy/test.py", line 12, in <module>
test_conv.weight.data = conv.weight.data
File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/core/weight_sharing.py", line 32, in __setattr__
return object.__setattr__(self_, name, value)
File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/core/weight_sharing.py", line 55, in __torch_function__
new_args[0].change_device_placement(new_args[1].device)
File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/core/weight_sharing.py", line 24, in __getattribute__
return object.__getattribute__(self_, name)
AttributeError: 'HabanaParameterWrapper' object has no attribute 'change_device_placement'
[05:23:04.770297][SYN_API ][info ][tid:4AD0] synDeviceRelease, status 0[synSuccess]
[05:23:04.779318][SYN_API ][info ][tid:4AD0] synDestroy, status 0[synSuccess]
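Going by the traceback, the failing statement is `test_conv.weight.data = conv.weight.data` at line 12 of test.py, while the posted snippet stops at `with torch.no_grad():`. A minimal sketch of that pattern, with one possible alternative that copies the values into the existing CPU parameter instead of rebinding `.data`, might look like the following. The `Conv2d` shapes are hypothetical (the original layer definitions are not shown), and the CPU fallback exists only so the sketch runs on a machine without a Gaudi device. Whether the in-place copy avoids the `HabanaParameterWrapper` error on 1.17 is an assumption, not something verified here.

```python
import torch

try:
    # Importing the bridge registers the "hpu" device with PyTorch.
    import habana_frameworks.torch.core as htcore  # noqa: F401
    device = torch.device("hpu")
except ImportError:
    # Fallback so the sketch still runs without the Habana stack.
    device = torch.device("cpu")

# Hypothetical layers: the shapes are placeholders, not the original ones.
conv = torch.nn.Conv2d(3, 8, kernel_size=3).to(device)  # source layer on HPU
test_conv = torch.nn.Conv2d(3, 8, kernel_size=3)         # destination layer on CPU

with torch.no_grad():
    # Pattern reported in the traceback (rebinds .data to a tensor on
    # another device):
    #   test_conv.weight.data = conv.weight.data
    #
    # Possible alternative (assumption): move the values to CPU first,
    # then copy them in place so the CPU parameter object is kept.
    test_conv.weight.copy_(conv.weight.to("cpu"))
    test_conv.bias.copy_(conv.bias.to("cpu"))

print(test_conv.weight.device)
```

If the in-place copy behaves as hoped, `test_conv` stays a CPU module and only its values change, so the `__setattr__` path in `weight_sharing.py` that appears in the traceback would not be involved in the assignment.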
System info:
================================================================================
Environment
================================================================================
Device gaudi2
OS ubuntu
OS version 22.04
Log file /var/log/habana_logs/install-2024-09-23-14-25-50.log
Release version 1.17.0-495
Habanalabs server vault.habana.ai
Rewrite installer config no
Install type validate
Python repo URL https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
Habanalabs software [OK]
habanalabs-container-runtime=1.17.1-40
habanalabs-dkms=1.17.0-495
habanalabs-firmware=1.17.0-495
habanalabs-firmware-odm=1.17.0-495
habanalabs-firmware-tools=1.17.0-495
habanalabs-graph=1.17.0-495
habanalabs-qual=1.17.0-495
habanalabs-rdma-core=1.17.0-495
habanalabs-thunk=1.17.0-495
================================================================================
System
================================================================================
CPU: 160
Model name: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
MemTotal: 2113407792 kB
Hugepagesize: 2048 kB
================================================================================
OS environment
================================================================================
Package manager [apt]
The required sudo privileges are [ok]
Python 3.10 [OK]
================================================================================
Basic dependencies
================================================================================
gcc [OK]
cmake [OK]
lsof [OK]
curl [OK]
wget [OK]
linux-headers-5.15.0-113-generic [OK]
ethtool [OK]
libelf-dev [OK]
libbz2-dev [OK]
liblzma-dev [OK]
libibverbs-dev [OK]
librdmacm-dev [OK]
dkms [OK]
linux-modules-extra-5.15.0-113-generic [OK]
================================================================================
PyTorch dependencies
================================================================================
gcc [OK]
cmake [OK]
lsof [OK]
curl [OK]
wget [OK]
unzip [MISSING]
libcurl4 [OK]
moreutils [MISSING]
iproute2 [OK]
libcairo2-dev [MISSING]
libglib2.0-dev [MISSING]
libselinux1-dev [MISSING]
libnuma-dev [MISSING]
libpcre2-dev [MISSING]
libatlas-base-dev [MISSING]
libjpeg-dev [MISSING]
liblapack-dev [MISSING]
libnuma-dev [MISSING]
google-perftools [MISSING]
numactl [MISSING]
libopenblas-dev [MISSING]
Open MPI version 4.1.5 will be install
================================================================================
Installed Habanalabs software
================================================================================
habanalabs-container-runtime=1.17.1-40
habanalabs-dkms=1.17.0-495
habanalabs-firmware=1.17.0-495
habanalabs-firmware-odm=1.17.0-495
habanalabs-firmware-tools=1.17.0-495
habanalabs-graph=1.17.0-495
habanalabs-qual=1.17.0-495
habanalabs-rdma-core=1.17.0-495
habanalabs-thunk=1.17.0-495
================================================================================
Full install log: /var/log/habana_logs/install-2024-09-23-14-25-50.log
================================================================================