Tensors taking time to shift from HPU to CPU

Aditya_Kulshrestha · June 18, 2024, 9:11pm

Hi,

I am working on TTS. During inference, the tensors are moved from HPU to CPU.
This transfer takes much longer than the rest of the inference combined.

Attaching the screenshots.

Sayantan_S · June 18, 2024, 9:18pm

You can check some TTS examples here:

Also, if input/output is taking too long, I suggest using HPU graphs:

import habana_frameworks.torch as ht
model = ht.hpu.wrap_in_hpu_graph(model)

import habana_frameworks.torch.core as htcore
htcore.hpu.ModuleCacher(max_graphs=10)(model=model, inplace=True)
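
Before changing anything, it may also be worth checking whether the time is really spent in the copy. One possibility, not confirmed in this thread, is that the device is still executing queued work when `.cpu()` is called, so the copy gets charged for the compute it has to wait on. Below is a minimal sketch of timing the two steps separately. The helper names `timed` and `profile_inference` are hypothetical, and `sync` stands for whatever call blocks until the device is idle.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[[], Any],
          sync: Callable[[], None] = lambda: None) -> Tuple[Any, float]:
    """Run fn() and return (result, elapsed seconds).

    sync() is called before and after fn(). It is assumed to block until
    the device has finished all queued work, so that work queued earlier
    is not charged to this step.
    """
    sync()
    start = time.perf_counter()
    result = fn()
    sync()
    return result, time.perf_counter() - start


def profile_inference(model, inputs, sync=lambda: None):
    """Time the forward pass and the device-to-host copy separately."""
    output, forward_s = timed(lambda: model(inputs), sync)
    host_output, copy_s = timed(lambda: output.cpu(), sync)
    return host_output, {"forward_s": forward_s, "copy_s": copy_s}
```

On Gaudi, `sync` would presumably be something like `ht.hpu.synchronize` (an assumption; check the API of your installed release). If `forward_s` grows and `copy_s` shrinks once synchronization is in place, the copy itself was probably not the bottleneck.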

Robinysh · July 9, 2024, 3:54am

I encountered something similar, and uninstalling mpi4py fixed it for me. Maybe you can try that.

By the way, I don't have a minimal reproducible example yet. I'll probably open an issue for that once I have a proper one.
