Describe the issue; be as descriptive as possible, you can include things like:
• What was the expected behavior:
Installing habanalabs-dkms 1.12.0-480 should succeed.
root@h001:~# dkms status habanalabs-dkms
habanalabs-dkms/1.11.0-587, 5.15.0-79-generic, x86_64: installed
habanalabs-dkms/1.11.0-587, 5.15.0-86-generic, x86_64: installed
root@h001:~#
• What is the observed result:
root@h001:~# apt upgrade habanalabs-dkms
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
Get more security updates through Ubuntu Pro with 'esm-apps' enabled:
gsasl-common libgsasl7
Learn more about Ubuntu Pro at https://ubuntu.com/pro
The following packages will be upgraded:
habanalabs-dkms
1 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/3,194 kB of archives.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] Y
(Reading database ... 284391 files and directories currently installed.)
Preparing to unpack .../habanalabs-dkms_1.12.0-480_all.deb ...
Error! The module/version combo: habanalabs-dkms-5.15.0-79-generic is not located in the DKMS tree.
dpkg: warning: old habanalabs-dkms package pre-removal script subprocess returned error exit status 3
dpkg: trying script from the new package instead ...
Error! The module/version combo: habanalabs-dkms-5.15.0-79-generic is not located in the DKMS tree.
dpkg: error processing archive /var/cache/apt/archives/habanalabs-dkms_1.12.0-480_all.deb (--unpack):
new habanalabs-dkms package pre-removal script subprocess returned error exit status 3
Errors were encountered while processing:
/var/cache/apt/archives/habanalabs-dkms_1.12.0-480_all.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)
• Is the issue consistently reproducible?
• If you are using an AWS DL1 instance, please report the AMI name that you are using
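One possible reading of the "not located in the DKMS tree" error, not confirmed against the package's actual pre-removal script: the string `habanalabs-dkms-5.15.0-79-generic` is what you get if the kernel version is passed to dkms where the module version belongs. That could happen if a script splits `dkms status` output on commas and assumes the older `module, version, kernel, arch: state` layout (an assumption about older dkms releases), while this system prints `module/version, kernel, arch: state`. The sketch below parses both layouts; `naive_module_version` is a hypothetical fragile parser written only to illustrate the suspected failure.

```python
import re

# Layout seen in this report: "module/version, kernel, arch: state"
_SLASH_FORMAT = re.compile(
    r"^(?P<module>[^/,\s]+)/(?P<version>[^,\s]+),\s*"
    r"(?P<kernel>[^,\s]+),\s*(?P<arch>[^:\s]+):\s*(?P<state>.+)$"
)
# Assumed older layout: "module, version, kernel, arch: state"
_COMMA_FORMAT = re.compile(
    r"^(?P<module>[^/,\s]+),\s*(?P<version>[^,\s]+),\s*"
    r"(?P<kernel>[^,\s]+),\s*(?P<arch>[^:\s]+):\s*(?P<state>.+)$"
)


def parse_dkms_status(line):
    """Parse one `dkms status` line in either layout; return a dict or None."""
    line = line.strip()
    for pattern in (_SLASH_FORMAT, _COMMA_FORMAT):
        match = pattern.match(line)
        if match:
            return match.groupdict()
    return None


def naive_module_version(line):
    """Hypothetical fragile parser: treats the second comma field as the version."""
    fields = [field.strip() for field in line.split(",")]
    module = fields[0].split("/")[0]
    return f"{module}-{fields[1]}"


if __name__ == "__main__":
    line = "habanalabs-dkms/1.11.0-587, 5.15.0-79-generic, x86_64: installed"
    print(parse_dkms_status(line))
    print(naive_module_version(line))
```

If this reading is right, `dkms status` itself is healthy and the failure is in how the maintainer script interprets it; checking the script under /var/lib/dpkg/info/ would confirm or rule this out.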
What are the details of the environment:
- Docker or not docker
Baremetal
- Build from source or binary distribution
Source
- OS version: uname -a
root@h001:~# uname -ar
Linux h001 5.15.0-86-generic #96-Ubuntu SMP Wed Sep 20 08:23:49 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
root@h001:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy
- Software versions: (dpkg -l | grep habanalabs)
root@h001:~# dpkg -l | grep habanalabs
ii habanalabs-container-runtime 1.12.0-480 amd64 Habana Labs container runtime. Provides a modified version of runc allowing users to run GPU enabled containers.
ii habanalabs-dkms 1.11.0-587 all habanalabs driver in DKMS format.
ii habanalabs-firmware 1.12.0-480 amd64 Firmware package for Habana Labs processing accelerators
ii habanalabs-firmware-tools 1.12.0-480 amd64 Habanalabs firmware tools package
ii habanalabs-graph 1.12.0-480 amd64 habanalabs graph compiler
ii habanalabs-qual 1.12.0-480 amd64 This package contains Habanalabs qualification package. It designed to assist server vendors to qualify their Goya based server on the production line.
ii habanalabs-rdma-core 1.12.0-480 all Habana Labs rdma-core components.
ii habanalabs-thunk 1.12.0-480 all habanalabs thunk
root@h001:~#
- Python versions used: python --version
root@h001:~# which python python2 python3 python3.8 python3.9 python3.10 python3.11 python3.12
/usr/bin/python
/usr/bin/python3
/usr/bin/python3.8
/usr/bin/python3.9
/usr/bin/python3.10
/usr/bin/python3.11
/usr/bin/python3.12
root@h001:~# /usr/bin/python -V
Python 3.10.12
root@h001:~# /usr/bin/python3 -V
Python 3.10.12
root@h001:~# /usr/bin/python3.8 -V
Python 3.8.18
root@h001:~# /usr/bin/python3.9 -V
Python 3.9.18
root@h001:~# /usr/bin/python3.10 -V
Python 3.10.12
root@h001:~# /usr/bin/python3.11 -V
Python 3.11.5
root@h001:~# /usr/bin/python3.12 -V
Python 3.12.0
- Please attach the dmesg dump, dmesg.log: dmesg > dmesg.log
Will send the full dump if needed. Excerpt of the relevant lines:
[ 28.630385] BOOT_IMAGE=images/default-habana-image-u22.04.3LTS/vmlinuz
[ 96.303050] habanalabs_en: loading driver, version: 1.11.0-e6eb0fd
[ 96.314413] habanalabs_cn: loading driver, version: 1.11.0-e6eb0fd
[ 96.765881] habanalabs: loading driver, version: 1.11.0-e6eb0fd
[ 96.766324] habanalabs 0000:b3:00.0: habanalabs device found [1da3:1000] (rev 1)
[ 96.766422] habanalabs 0000:b3:00.0: enabling device (0140 -> 0142)
[ 96.766449] habanalabs 0000:b3:00.0: PCI INT A: no GSI - using ISA IRQ 11
[ 96.766568] habanalabs 0000:b4:00.0: habanalabs device found [1da3:1000] (rev 1)
[ 96.766657] habanalabs 0000:b4:00.0: enabling device (0140 -> 0142)
[ 96.766674] habanalabs 0000:b4:00.0: PCI INT A: no GSI - using ISA IRQ 11
[ 96.766817] habanalabs 0000:cc:00.0: habanalabs device found [1da3:1000] (rev 1)
[ 96.766913] habanalabs 0000:cc:00.0: enabling device (0140 -> 0142)
[ 96.766935] habanalabs 0000:cc:00.0: PCI INT A: no GSI - using ISA IRQ 11
[ 96.767097] habanalabs 0000:cd:00.0: habanalabs device found [1da3:1000] (rev 1)
[ 96.767182] habanalabs 0000:cd:00.0: enabling device (0140 -> 0142)
[ 96.767198] habanalabs 0000:cd:00.0: PCI INT A: no GSI - using ISA IRQ 11
[ 96.767898] habanalabs 0000:19:00.0: habanalabs device found [1da3:1000] (rev 1)
[ 96.767993] habanalabs 0000:19:00.0: enabling device (0140 -> 0142)
[ 96.768015] habanalabs 0000:19:00.0: PCI INT A: no GSI - using ISA IRQ 11
[ 96.768216] habanalabs 0000:1a:00.0: habanalabs device found [1da3:1000] (rev 1)
[ 96.768300] habanalabs 0000:1a:00.0: enabling device (0140 -> 0142)
[ 96.768344] habanalabs 0000:1a:00.0: PCI INT A: no GSI - using ISA IRQ 11
[ 96.768463] habanalabs 0000:34:00.0: habanalabs device found [1da3:1000] (rev 1)
[ 96.768546] habanalabs 0000:34:00.0: enabling device (0140 -> 0142)
[ 96.768566] habanalabs 0000:34:00.0: PCI INT A: no GSI - using ISA IRQ 11
[ 96.768746] habanalabs 0000:33:00.0: habanalabs device found [1da3:1000] (rev 1)
[ 96.768828] habanalabs 0000:33:00.0: enabling device (0140 -> 0142)
[ 96.768880] habanalabs 0000:33:00.0: PCI INT A: no GSI - using ISA IRQ 11
[ 96.813273] habanalabs hl1: Loading firmware to device, may take some time...
[ 96.813284] habanalabs hl2: Loading firmware to device, may take some time...
[ 96.814278] habanalabs hl0: Loading firmware to device, may take some time...
[ 96.825163] habanalabs hl4: Loading firmware to device, may take some time...
[ 96.829362] habanalabs hl7: Loading firmware to device, may take some time...
[ 96.829371] habanalabs hl6: Loading firmware to device, may take some time...
[ 96.829403] habanalabs hl5: Loading firmware to device, may take some time...
[ 96.829542] habanalabs hl3: Loading firmware to device, may take some time...
[ 96.890488] habanalabs hl1: preboot full version: 'Preboot version hl-gaudi-1.1.0-fw-32.3.5-sec-4 (Oct 05 2021 - 15:13:16)'
[ 96.890492] habanalabs hl1: BTL version 81608d8d
[ 96.897737] habanalabs hl4: preboot full version: 'Preboot version hl-gaudi-1.1.0-fw-32.3.5-sec-4 (Oct 05 2021 - 15:13:16)'
[ 96.897741] habanalabs hl4: BTL version 81608d8d
[ 96.924799] habanalabs hl2: preboot full version: 'Preboot version hl-gaudi-1.1.0-fw-32.3.5-sec-4 (Oct 05 2021 - 15:13:16)'
[ 96.924801] habanalabs hl2: BTL version 81608d8d
[ 96.928466] habanalabs hl7: preboot full version: 'Preboot version hl-gaudi-1.1.0-fw-32.3.5-sec-4 (Oct 05 2021 - 15:13:16)'
[ 96.928469] habanalabs hl7: BTL version 81608d8d
[ 96.962904] habanalabs hl5: preboot full version: 'Preboot version hl-gaudi-1.1.0-fw-32.3.5-sec-4 (Oct 05 2021 - 15:13:16)'
[ 96.962908] habanalabs hl5: BTL version 81608d8d
[ 96.968568] habanalabs hl0: preboot full version: 'Preboot version hl-gaudi-1.1.0-fw-32.3.5-sec-4 (Oct 05 2021 - 15:13:16)'
[ 96.968571] habanalabs hl0: BTL version 81608d8d
[ 97.006410] habanalabs hl6: preboot full version: 'Preboot version hl-gaudi-1.1.0-fw-32.3.5-sec-4 (Oct 05 2021 - 15:13:16)'
[ 97.006415] habanalabs hl6: BTL version 81608d8d
[ 97.023141] habanalabs hl3: preboot full version: 'Preboot version hl-gaudi-1.1.0-fw-32.3.5-sec-4 (Oct 05 2021 - 15:13:16)'
[ 97.023145] habanalabs hl3: BTL version 81608d8d
[ 105.021285] habanalabs hl6: boot-fit version 32.6.6-sec-4
[ 105.023217] habanalabs hl4: boot-fit version 32.6.6-sec-4
[ 105.025166] habanalabs hl5: boot-fit version 32.6.6-sec-4
[ 105.027096] habanalabs hl7: boot-fit version 32.6.6-sec-4
[ 105.027488] habanalabs hl1: boot-fit version 32.6.6-sec-4
[ 105.047550] habanalabs hl0: boot-fit version 32.6.6-sec-4
[ 105.049999] habanalabs hl3: boot-fit version 32.6.6-sec-4
[ 105.060031] habanalabs hl2: boot-fit version 32.6.6-sec-4
[ 106.213870] habanalabs hl5: Successfully loaded firmware to device
[ 106.215847] habanalabs hl6: Successfully loaded firmware to device
[ 106.217787] habanalabs hl4: Successfully loaded firmware to device
[ 106.219695] habanalabs hl7: Successfully loaded firmware to device
[ 106.227616] habanalabs hl1: Successfully loaded firmware to device
[ 106.245196] habanalabs hl3: Successfully loaded firmware to device
[ 106.247146] habanalabs hl0: Successfully loaded firmware to device
[ 106.253212] habanalabs hl2: Successfully loaded firmware to device
[ 108.792332] habanalabs hl4: Linux version 32.6.6-sec-4
[ 108.806251] habanalabs hl1: Linux version 32.6.6-sec-4
[ 108.811276] habanalabs hl6: Linux version 32.6.6-sec-4
[ 108.815292] habanalabs hl5: Linux version 32.6.6-sec-4
[ 108.824322] habanalabs hl7: Linux version 32.6.6-sec-4
[ 108.847297] habanalabs hl0: Linux version 32.6.6-sec-4
[ 108.855284] habanalabs hl2: Linux version 32.6.6-sec-4
[ 108.867287] habanalabs hl3: Linux version 32.6.6-sec-4
[ 108.902232] habanalabs hl1: Found GAUDI device with 32GB DRAM
[ 108.906129] habanalabs hl6: Found GAUDI device with 32GB DRAM
[ 108.906206] habanalabs hl4: Found GAUDI device with 32GB DRAM
[ 108.919322] habanalabs hl5: Found GAUDI device with 32GB DRAM
[ 108.928341] habanalabs hl7: Found GAUDI device with 32GB DRAM
[ 108.939263] habanalabs hl0: Found GAUDI device with 32GB DRAM
[ 108.959347] habanalabs hl2: Found GAUDI device with 32GB DRAM
[ 108.964283] habanalabs hl3: Found GAUDI device with 32GB DRAM
[ 110.173847] habanalabs 0000:b4:00.0 enp180s0d8: renamed from eth1
[ 110.178545] habanalabs hl1: Privileged security enabled
[ 110.178677] habanalabs hl1: hwmon7: add sensors information
[ 110.178680] habanalabs hl1: Successfully added device 0000:b4:00.0 to habanalabs driver
[ 110.198996] habanalabs 0000:b4:00.0 enp180s0d1: renamed from eth0
[ 110.225655] habanalabs 0000:b4:00.0 enp180s0d9: renamed from eth2
[ 110.267872] habanalabs 0000:33:00.0 enp51s0d1: renamed from eth3
[ 110.274537] habanalabs hl5: Privileged security enabled
[ 110.274860] habanalabs hl5: hwmon8: add sensors information
[ 110.274863] habanalabs hl5: Successfully added device 0000:1a:00.0 to habanalabs driver
[ 110.279285] habanalabs hl4: Privileged security enabled
[ 110.279432] habanalabs hl4: hwmon9: add sensors information
[ 110.279434] habanalabs hl4: Successfully added device 0000:19:00.0 to habanalabs driver
[ 110.279703] habanalabs hl3: Privileged security enabled
[ 110.279865] habanalabs hl3: hwmon10: add sensors information
[ 110.279868] habanalabs hl3: Successfully added device 0000:cd:00.0 to habanalabs driver
[ 110.282860] habanalabs hl7: Privileged security enabled
[ 110.282966] habanalabs hl7: hwmon11: add sensors information
[ 110.282969] habanalabs hl7: Successfully added device 0000:33:00.0 to habanalabs driver
[ 110.284286] habanalabs hl2: Privileged security enabled
[ 110.284667] habanalabs hl2: hwmon12: add sensors information
[ 110.284671] habanalabs hl2: Successfully added device 0000:cc:00.0 to habanalabs driver
[ 110.286349] habanalabs hl6: Privileged security enabled
[ 110.286454] habanalabs hl6: hwmon13: add sensors information
[ 110.286456] habanalabs hl6: Successfully added device 0000:34:00.0 to habanalabs driver
[ 110.309629] habanalabs 0000:33:00.0 enp51s0d8: renamed from eth6
[ 110.354589] habanalabs 0000:1a:00.0 ens2d1: renamed from eth4
[ 110.386776] habanalabs 0000:34:00.0 enp52s0d1: renamed from eth5
[ 110.399872] habanalabs hl0: Privileged security enabled
[ 110.400164] habanalabs hl0: hwmon14: add sensors information
[ 110.400170] habanalabs hl0: Successfully added device 0000:b3:00.0 to habanalabs driver
[ 110.432086] habanalabs 0000:34:00.0 enp52s0d8: renamed from eth0
[ 110.542245] habanalabs 0000:1a:00.0 ens2d8: renamed from eth7
[ 110.685681] habanalabs 0000:1a:00.0 ens2d9: renamed from eth9
[ 110.735132] habanalabs 0000:33:00.0 enp51s0d9: renamed from eth8
[ 110.790130] habanalabs 0000:34:00.0 enp52s0d9: renamed from eth11
[ 110.837609] habanalabs 0000:cd:00.0 enp205s0d9: renamed from eth19
[ 110.897824] habanalabs 0000:19:00.0 ens1d9: renamed from eth10
[ 110.936948] habanalabs 0000:b3:00.0 enp179s0d1: renamed from eth12
[ 110.957693] habanalabs 0000:b3:00.0 enp179s0d9: renamed from eth3
[ 110.990932] habanalabs 0000:cd:00.0 enp205s0d8: renamed from eth18
[ 111.020557] habanalabs 0000:19:00.0 ens1d8: renamed from eth2
[ 111.064814] habanalabs 0000:cc:00.0 enp204s0d8: renamed from eth15
[ 111.095429] habanalabs 0000:cd:00.0 enp205s0d1: renamed from eth13
[ 111.114285] habanalabs 0000:cc:00.0 enp204s0d1: renamed from eth14
[ 111.145577] habanalabs 0000:cc:00.0 enp204s0d9: renamed from eth16
[ 111.170649] habanalabs 0000:b3:00.0 enp179s0d8: renamed from eth17
[ 111.212608] habanalabs 0000:19:00.0 ens1d1: renamed from eth1
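The excerpt above can be condensed to two facts: which driver version was loaded, and which devices were added successfully. A minimal sketch, assuming the message wording shown in this log (other driver releases may word these lines differently); `summarize_dmesg` is an illustrative helper, not part of any Habana tool.

```python
import re

# Matches e.g. "] habanalabs_en: loading driver, version: 1.11.0-e6eb0fd"
DRIVER_RE = re.compile(r"\]\s*(habanalabs\w*): loading driver, version: (\S+)")
# Matches e.g. "] habanalabs hl1: Successfully added device 0000:b4:00.0 to habanalabs driver"
ADDED_RE = re.compile(
    r"\]\s*habanalabs (hl\d+): Successfully added device (\S+) to habanalabs driver"
)


def summarize_dmesg(text):
    """Return ({module: driver_version}, {hlN: pci_address}) found in dmesg text."""
    drivers = dict(DRIVER_RE.findall(text))
    added = dict(ADDED_RE.findall(text))
    return drivers, added


# A few lines copied from the excerpt above.
SAMPLE = """\
[ 96.303050] habanalabs_en: loading driver, version: 1.11.0-e6eb0fd
[ 96.765881] habanalabs: loading driver, version: 1.11.0-e6eb0fd
[ 110.178680] habanalabs hl1: Successfully added device 0000:b4:00.0 to habanalabs driver
[ 110.400170] habanalabs hl0: Successfully added device 0000:b3:00.0 to habanalabs driver
"""

if __name__ == "__main__":
    drivers, added = summarize_dmesg(SAMPLE)
    print("driver versions:", drivers)
    print("devices added:", sorted(added))
```

Run against the full dmesg.log, this should report the 1.11.0 driver for all three modules and eight added devices, which is consistent with the old DKMS package still being the one in use.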
If Bare Metal, please share the current Habana release version and Firmware version by running this command: sudo hl-smi -q
root@h001:~# sudo hl-smi
+-----------------------------------------------------------------------------+
| HL-SMI Version: hl-1.12.0-fw-46.0.2.0 |
| Driver Version: 1.11.0-e6eb0fd |
|-------------------------------+----------------------+----------------------+
| AIP Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | AIP-Util Compute M. |
|===============================+======================+======================|
| 0 HL-205 N/A | 0000:b3:00.0 N/A | 0 |
| N/A 32C N/A 104W / 350W | 512MiB / 32768MiB | 15% N/A |
|-------------------------------+----------------------+----------------------+
| 1 HL-205 N/A | 0000:b4:00.0 N/A | 0 |
| N/A 29C N/A 102W / 350W | 512MiB / 32768MiB | 14% N/A |
|-------------------------------+----------------------+----------------------+
| 2 HL-205 N/A | 0000:cc:00.0 N/A | 0 |
| N/A 30C N/A 103W / 350W | 512MiB / 32768MiB | 14% N/A |
|-------------------------------+----------------------+----------------------+
| 3 HL-205 N/A | 0000:cd:00.0 N/A | 0 |
| N/A 37C N/A 103W / 350W | 512MiB / 32768MiB | 15% N/A |
|-------------------------------+----------------------+----------------------+
| 4 HL-205 N/A | 0000:19:00.0 N/A | 0 |
| N/A 33C N/A 96W / 350W | 512MiB / 32768MiB | 12% N/A |
|-------------------------------+----------------------+----------------------+
| 5 HL-205 N/A | 0000:1a:00.0 N/A | 0 |
| N/A 33C N/A 106W / 350W | 512MiB / 32768MiB | 15% N/A |
|-------------------------------+----------------------+----------------------+
| 6 HL-205 N/A | 0000:34:00.0 N/A | 0 |
| N/A 35C N/A 104W / 350W | 512MiB / 32768MiB | 15% N/A |
|-------------------------------+----------------------+----------------------+
| 7 HL-205 N/A | 0000:33:00.0 N/A | 0 |
| N/A 31C N/A 100W / 350W | 512MiB / 32768MiB | 13% N/A |
|-------------------------------+----------------------+----------------------+
| Compute Processes: AIP Memory |
| AIP PID Type Process name Usage |
|=============================================================================|
| 0 N/A N/A N/A N/A |
| 1 N/A N/A N/A N/A |
| 2 N/A N/A N/A N/A |
| 3 N/A N/A N/A N/A |
| 4 N/A N/A N/A N/A |
| 5 N/A N/A N/A N/A |
| 6 N/A N/A N/A N/A |
| 7 N/A N/A N/A N/A |
+=============================================================================+
root@h001:~#
Will send ‘hl-smi -q’ if needed.
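The hl-smi header shows the same skew as the `dpkg -l` listing: the tools report 1.12.0 while the loaded driver is still 1.11.0. A minimal sketch for spotting the package that is out of step, assuming the usual `dpkg -l` column order (status, name, version, architecture, description); the helper names are illustrative only.

```python
from collections import Counter


def package_versions(dpkg_output, prefix="habanalabs"):
    """Map installed package name -> version for `dpkg -l` rows matching prefix."""
    versions = {}
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Keep only fully installed rows ("ii") for the packages of interest.
        if len(fields) >= 3 and fields[0] == "ii" and fields[1].startswith(prefix):
            versions[fields[1]] = fields[2]
    return versions


def release_of(version):
    """Strip the build suffix: '1.12.0-480' -> '1.12.0'."""
    return version.split("-", 1)[0]


def out_of_step(versions):
    """Return packages whose release differs from the most common release."""
    if not versions:
        return {}
    counts = Counter(release_of(v) for v in versions.values())
    majority, _ = counts.most_common(1)[0]
    return {name: v for name, v in versions.items() if release_of(v) != majority}


# Three rows copied from the dpkg listing in this report.
SAMPLE = """\
ii habanalabs-dkms 1.11.0-587 all habanalabs driver in DKMS format.
ii habanalabs-firmware 1.12.0-480 amd64 Firmware package for Habana Labs processing accelerators
ii habanalabs-thunk 1.12.0-480 all habanalabs thunk
"""

if __name__ == "__main__":
    print(out_of_step(package_versions(SAMPLE)))
```

On this system only habanalabs-dkms is flagged, which matches the failed upgrade: every other habanalabs package already moved to 1.12.0-480.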