I was not able to use hpu_backend with torch.compile(model, backend="hpu_backend"). It reports that the backend is not available. I believe I installed the dependencies correctly using the command HABANALABS_VIRTUAL_DIR='./.venv' habanalabs-installer.sh install --type pytorch --venv, but I am not entirely sure.
Here is the error stack trace:
  File "/home/user/.venv/lib/python3.10/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 92, in _call_target
    return _target_(*args, **kwargs)
  File "/home/user/src/models/modules/anygpt.py", line 161, in constructor
    "anygpt": torch.compile(model, backend="hpu_backend"),
  File "/home/user/.venv/lib/python3.10/site-packages/lightning/fabric/wrappers.py", line 380, in _capture
    compiled_model = compile_fn(model, **kwargs)
  File "/home/user/.venv/lib/python3.10/site-packages/torch/__init__.py", line 1818, in compile
    backend = _TorchCompileWrapper(backend, mode, options, dynamic)
  File "/home/user/.venv/lib/python3.10/site-packages/torch/__init__.py", line 1686, in __init__
    self.compiler_fn = lookup_backend(backend)
  File "/home/user/.venv/lib/python3.10/site-packages/torch/_dynamo/backends/registry.py", line 62, in lookup_backend
    raise InvalidBackend(name=compiler_fn)
torch._dynamo.exc.InvalidBackend: Invalid backend: 'hpu_backend', see `torch._dynamo.list_backends()` for available backends.
Running torch._dynamo.list_backends() confirms that hpu_backend is not among the available backends:
>>> import habana_frameworks.torch.core as htcore
/home/user/.venv/lib/python3.10/site-packages/transformers/utils/generic.py:485: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
>>> import torch
>>> torch._dynamo.list_backends()
['cudagraphs', 'inductor', 'onnxrt', 'openxla', 'openxla_eval', 'tvm']
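In case it helps anyone reproduce this, the check above can be wrapped into a small script that also records which package versions the interpreter actually sees. This is only a rough sketch: the helper names (installed_version, registered_backends, backend_report) are my own, and the distribution name habana-torch-plugin is an assumption about what the installer puts into the venv, so adjust it if your environment names the package differently.

```python
import importlib
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def registered_backends(preload=("habana_frameworks.torch.core",)):
    """List the backend names torch.compile can look up.

    The modules in `preload` are imported first (mirroring the REPL session
    above); a module that is not installed is skipped. Returns [] when
    torch._dynamo cannot be imported.
    """
    for module_name in preload:
        try:
            importlib.import_module(module_name)
        except ImportError:
            pass  # not installed in this interpreter; keep going
    try:
        import torch._dynamo
    except ImportError:
        return []
    return list(torch._dynamo.list_backends())


def backend_report(name, dists=("torch", "habana-torch-plugin")):
    """Collect whether `name` is registered, plus the versions of `dists`.

    "habana-torch-plugin" is an assumed distribution name, not confirmed
    by the installer output in this post.
    """
    backends = registered_backends()
    return {
        "backend": name,
        "registered": name in backends,
        "available": backends,
        "versions": {dist: installed_version(dist) for dist in dists},
    }


if __name__ == "__main__":
    print(backend_report("hpu_backend"))
```

Running it with the venv's Python shows in one place whether hpu_backend is registered and which torch build is being imported, which should make it easier to tell a missing plugin apart from a plugin that is installed but did not register the backend.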
Here is some information about my system:
$ HABANALABS_VIRTUAL_DIR='./.venv' habanalabs-installer.sh validate
================================================================================
Environment
================================================================================
Device gaudi2
OS ubuntu
OS version 22.04
Log file /var/log/habana_logs/install-2024-06-23-19-11-48.log
Release version 1.15.0-479
Habanalabs server vault.habana.ai
Rewrite installer config no
Install type validate
Python repo URL https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
Habanalabs software [OK]
habanalabs-container-runtime=1.15.0-479
habanalabs-dkms=1.15.0-479
habanalabs-firmware=1.16.1-7
habanalabs-firmware-tools=1.15.0-479
habanalabs-graph=1.15.0-479
habanalabs-qual=1.15.0-479
habanalabs-rdma-core=1.15.0-479
habanalabs-thunk=1.15.0-479
================================================================================
System
================================================================================
CPU: 160
Model name: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
MemTotal: 1056375280 kB
Hugepagesize: 2048 kB
================================================================================
OS environment
================================================================================
Package manager [apt]
The required sudo privileges are [FAILED]
Python 3.10 [OK]
================================================================================
Basic dependencies
================================================================================
gcc [OK]
cmake [OK]
lsof [OK]
curl [OK]
wget [OK]
linux-headers-5.15.0-92-generic [OK]
ethtool [OK]
libelf-dev [OK]
libbz2-dev [OK]
liblzma-dev [OK]
libibverbs-dev [OK]
librdmacm-dev [OK]
dkms [OK]
linux-modules-extra-5.15.0-92-generic [OK]
================================================================================
PyTorch dependencies
================================================================================
gcc [OK]
cmake [OK]
lsof [OK]
curl [OK]
wget [OK]
unzip [OK]
libcurl4 [OK]
moreutils [OK]
iproute2 [OK]
libcairo2-dev [OK]
libglib2.0-dev [OK]
libselinux1-dev [OK]
libnuma-dev [OK]
libpcre2-dev [OK]
libatlas-base-dev [OK]
libjpeg-dev [OK]
liblapack-dev [OK]
libnuma-dev [OK]
google-perftools [OK]
numactl [OK]
libopenblas-dev [OK]
The installed version is: 4.1.5
================================================================================
Installed Habanalabs software
================================================================================
habanalabs-container-runtime=1.15.0-479
habanalabs-dkms=1.15.0-479
habanalabs-firmware=1.16.1-7
habanalabs-firmware-tools=1.15.0-479
habanalabs-graph=1.15.0-479
habanalabs-qual=1.15.0-479
habanalabs-rdma-core=1.15.0-479
habanalabs-thunk=1.15.0-479
================================================================================
Full install log: /var/log/habana_logs/install-2024-06-23-19-11-48.log
================================================================================