Gaudi2 PyTorch Container - Device acquire failed

On a devcloud Gaudi2 instance, the PyTorch image appears not to have Gaudi2 HPU support properly installed.

Details of the environment:

Docker command: docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host vault.habana.ai/gaudi-docker/1.8.0/ubuntu20.04/habanalabs/pytorch-installer-1.13.1 /bin/bash

Simple Python test:

root@devcloud:/# python3.8
Python 3.8.10 (default, Nov 14 2022, 12:59:47) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> device = torch.device("hpu")
>>> tens = torch.rand(3)
>>> tens
tensor([0.1079, 0.9648, 0.6298])
>>> tens.to(device)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: PyTorch is not linked with support for hpu devices
>>> quit()

hl-smi output inside the container:

root@devcloud:/# hl-smi
+-----------------------------------------------------------------------------+
| HL-SMI Version: hl-1.8.0-fw-40.0.0.2 |
| Driver Version: 1.7.1-68c1a21 |
|-------------------------------+----------------------+----------------------+
| AIP Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | AIP-Util Compute M. |
|===============================+======================+======================|
| 0 HL-225 N/A | 0000:33:00.0 N/A | 0 |
| N/A 32C N/A 107W / 600W | 768MiB / 98304MiB | 3% N/A |
|-------------------------------+----------------------+----------------------+
| 1 HL-225 N/A | 0000:34:00.0 N/A | 0 |
| N/A 35C N/A 96W / 600W | 768MiB / 98304MiB | 0% N/A |
|-------------------------------+----------------------+----------------------+
| 2 HL-225 N/A | 0000:4d:00.0 N/A | 0 |
| N/A 36C N/A 105W / 600W | 768MiB / 98304MiB | 4% N/A |
|-------------------------------+----------------------+----------------------+
| 3 HL-225 N/A | 0000:4e:00.0 N/A | 0 |
| N/A 32C N/A 110W / 600W | 768MiB / 98304MiB | 3% N/A |
|-------------------------------+----------------------+----------------------+
| 4 HL-225 N/A | 0000:b3:00.0 N/A | 0 |
| N/A 37C N/A 105W / 600W | 768MiB / 98304MiB | 2% N/A |
|-------------------------------+----------------------+----------------------+
| 5 HL-225 N/A | 0000:9b:00.0 N/A | 0 |
| N/A 38C N/A 100W / 600W | 768MiB / 98304MiB | 1% N/A |
|-------------------------------+----------------------+----------------------+
| 6 HL-225 N/A | 0000:b4:00.0 N/A | 0 |
| N/A 33C N/A 110W / 600W | 768MiB / 98304MiB | 3% N/A |
|-------------------------------+----------------------+----------------------+
| 7 HL-225 N/A | 0000:9a:00.0 N/A | 0 |
| N/A 34C N/A 98W / 600W | 768MiB / 98304MiB | 1% N/A |
|-------------------------------+----------------------+----------------------+
| Compute Processes: AIP Memory |
| AIP PID Type Process name Usage |
|=============================================================================|
| 0 N/A N/A N/A N/A |
| 1 N/A N/A N/A N/A |
| 2 N/A N/A N/A N/A |
| 3 N/A N/A N/A N/A |
| 4 N/A N/A N/A N/A |
| 5 N/A N/A N/A N/A |
| 6 N/A N/A N/A N/A |
| 7 N/A N/A N/A N/A |
+=============================================================================+

Is there something else we need to run after launching the container? Or is this not the right image for PyTorch on Gaudi2?

To enable the HPU device, you first need to import the Habana library, as shown here:

import habana_frameworks.torch.core as htcore

Once you do that, you should be able to run tens.to(device) without the "not linked with support for hpu devices" error.
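
For reference, here is a minimal sketch of the original test with the import added. The helper names hpu_bridge_installed and random_tensor_on_best_device are hypothetical, and the CPU fallback is an addition for illustration only, not something from this thread. The sketch only checks that the habana_frameworks package is present; it does not check that a device can actually be acquired.

```python
import importlib.util


def hpu_bridge_installed() -> bool:
    """Return True if the habana_frameworks package can be found.

    Hypothetical helper: it only looks for the package on the import
    path and does not verify that a Gaudi device is usable.
    """
    return importlib.util.find_spec("habana_frameworks") is not None


def random_tensor_on_best_device():
    """Create a small tensor and move it to "hpu" if possible, else "cpu"."""
    import torch

    if hpu_bridge_installed():
        # Importing this module is the step that was missing in the
        # original test; without it, tens.to("hpu") raises RuntimeError.
        import habana_frameworks.torch.core as htcore  # noqa: F401
        device = torch.device("hpu")
    else:
        # Assumption for illustration: fall back to CPU off-Gaudi.
        device = torch.device("cpu")

    tens = torch.rand(3)
    return tens.to(device)
```

Inside the container, calling random_tensor_on_best_device() and printing the returned tensor's .device attribute should show whether the tensor ended up on the HPU.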