When running a simple TensorFlow script:
import tensorflow as tf
from habana_frameworks.tensorflow import load_habana_module
load_habana_module()
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(10),
])
I get the following error:
2021-10-28 08:27:11.109007: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-10-28 08:27:11.243634: F /home/jenkins/workspace/cdsoftwarebuilder/create-tensorflow-module---bpt-d/tensorflow-training/habana_device/habana_device.cpp:121] Device acquire failed. Err: 26
Aborted (core dumped)
I followed this guide to set up an Ubuntu 20.04 AWS DL1 instance: 3. Installation Guide — Gaudi Documentation 1.0.1 documentation
AMI: ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210430 / ami-09e67e426f25ce0d7
I’m running without Docker, using the binary distribution.
Linux kernel: Linux ip-172-30-4-24 5.11.0-1020-aws #21~20.04.2-Ubuntu SMP Fri Oct 1 13:03:59 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Software:
ii  habanalabs-aeon            1.1.0-614  amd64  This package contains Habanalabs AEON package.
ii  habanalabs-dkms            1.1.0-614  all    habanalabs driver in DKMS format.
ii  habanalabs-firmware-tools  1.1.0-614  amd64  Habanalabs firmware tools package
ii  habanalabs-graph           1.1.0-614  amd64  habanalabs graph compiler
ii  habanalabs-qual            1.1.0-614  amd64  This package contains Habanalabs qualification package. It designed to assist server vendors to qualify their Goya based server on the production line.
ii  habanalabs-thunk           1.1.0-614  all    habanalabs thunk
Python 3.8.10
dmesg log: [ 0.000000] Linux version 5.11.0-1020-aws (buildd@lcy01-amd64-010) (gcc (Ubun - Pastebin.com
hl-smi output: ================ HL-SMI LOG ================Timestamp : Thu Oct 28 08:53: - Pastebin.com
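Since the abort happens at device acquisition, one thing worth ruling out before loading the Habana module is whether the kernel driver is loaded and any device nodes exist. Below is a minimal sketch of such a check. It assumes the driver shows up as `habanalabs` in /proc/modules (guessed from the habanalabs-dkms package name) and that device nodes match /dev/hl*; both are assumptions on my part, and the helper names `driver_loaded` and `device_nodes` are made up for illustration. It does not interpret "Err: 26".

```python
import glob
from pathlib import Path


def driver_loaded(modules_file="/proc/modules", name="habanalabs"):
    """Return True if a kernel module called `name` is listed in modules_file.

    The module name "habanalabs" is an assumption, not taken from the docs.
    """
    path = Path(modules_file)
    if not path.exists():
        return False
    for line in path.read_text().splitlines():
        fields = line.split()
        # The first field of each /proc/modules line is the module name.
        if fields and fields[0] == name:
            return True
    return False


def device_nodes(pattern="/dev/hl*"):
    """Return a sorted list of paths matching the assumed device node pattern."""
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    print("habanalabs module loaded:", driver_loaded())
    print("device nodes:", device_nodes() or "none found")
```

If the module is not listed or no nodes are found, the problem is likely below TensorFlow (driver install or DKMS build against the 5.11 kernel) rather than in the script itself.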