Inference

FAQ This section contains a list of frequently asked questions that will be of interest to many users

Topic	Replies	Views	Activity
About the Inference category Inference	0	883	December 21, 2020
Current best inference server implementation for Gaudi2 Inference pytorch , performance , models	3	474	January 2, 2025
Where can I get more information on Habana’s first-generation inference processor, previously known as Goya? FAQ	1	960	December 3, 2024
Llama inference result with infinite eot_id tokens Inference	4	219	December 3, 2024
Graph compile failed when torch.repeat Inference pytorch	3	139	November 3, 2024
FP8 range for E4M3 dtype FAQ pytorch	3	368	September 4, 2024
What is --enforce-eager Inference	3	1353	July 30, 2024
Tensors taking time to shift from HPU to CPU Inference pytorch	2	147	July 9, 2024
Running optimum-habana sample on gaudi FAQ pytorch	2	279	June 27, 2024
LangChain: Optimum Habana Examples Text-Generation Inference	3	280	June 4, 2024
Does Gaudi2 lib support Mixtral-8x7b? Inference models	1	170	March 29, 2024
Dose habana support Mixtral-8x7b? Inference	1	182	March 29, 2024
why tensorflow support dopped in release 1.15 Inference	0	201	March 29, 2024
Graph compile failed error when running txt2image.py from Habana Model-References repo Inference	3	408	November 28, 2023
Pytorch Empty Tensor error when running Stable Diffusion on optimum-habana Inference pytorch	9	690	November 14, 2023
Missing Results for LLaMA2 on Gaudi2 Inference	0	413	August 16, 2023
A question about how to use "wrap_in_hpu_graph" Inference pytorch	3	683	April 25, 2023
Strange results with torch.randn - is it really giving normal distributed tensor? Inference pytorch	8	2596	November 14, 2022
Performance data (latency) for VGG16 layer-by-layer inference with Goya Inference	3	1022	August 4, 2021