ValueError: invalid type: 'torch.hpu.FloatTensor'

I am trying to train the YOLOX algorithm on Gaudi2. I am getting the above error at this operation:
grid = torch.stack((xv, yv), 2).view(1, 1, hsize, wsize, 2).type(dtype)

How can I solve it?

Thanks for posting.

Can you try:
grid = torch.stack((xv, yv), 2).view(1, 1, hsize, wsize, 2).type(torch.FloatTensor)
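
One caveat with the line above: torch.FloatTensor is a CPU tensor type, so .type(torch.FloatTensor) also returns the result on the CPU. A device-agnostic alternative is to cast with .to(), copying the dtype and device from a tensor that already lives where you want it. The following is only a minimal sketch; make_grid and the features tensor are hypothetical names for illustration, not part of the YOLOX code, and it runs on CPU as written.

```python
import torch


def make_grid(hsize, wsize, like):
    """Build a (1, 1, hsize, wsize, 2) grid matching the dtype and device of `like`."""
    # `make_grid` and `like` are illustrative names, not YOLOX API.
    yv, xv = torch.meshgrid(torch.arange(hsize), torch.arange(wsize), indexing="ij")
    grid = torch.stack((xv, yv), 2).view(1, 1, hsize, wsize, 2)
    # .to() changes dtype and device together, without naming a
    # device-specific tensor type string such as 'torch.hpu.FloatTensor'.
    return grid.to(device=like.device, dtype=like.dtype)


# Stand-in for a feature map; in training this would already be on the accelerator.
features = torch.zeros(1, 3, 2, 3)
grid = make_grid(2, 3, features)
```

If only the dtype needs to change, tensor.float() or tensor.to(torch.float32) keeps the tensor on its current device.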

Thanks. That helped me get past the previous error. Now I am getting the following error.

[1,0]: File “/workspace/yolox/utils/”, line 102, in bboxes_iou
[1,0]: en = (tl < br).type(torch.FloatTensor).prod(dim=2)
[1,0]: │ │ │ └ <class ‘torch.FloatTensor’>
[1,0]: │ │ └ <module ‘torch’ from ‘/usr/local/lib/python3.8/dist-packages/torch/’>
[1,0]: │ └
[1,0]: └
[1,0]:RuntimeError: [05-30 15:25:23::141879][R000][41919]FATAL ERROR :: MODULE:SYNHELPER node add failed 1

Can you point me to the code you are using? We have a YOLOX model enabled here. Also, which SW version do you have?


I am trying to integrate Horovod into the YOLOX code, and found that when I change the optimizer code from

        self.optimizer = self.exp.get_optimizer(self.args.batch_size, self.args.hpu)

to

        self.optimizer = self.exp.get_optimizer(self.args.batch_size, self.args.hpu)

        self.optimizer = hvd.DistributedOptimizer(self.optimizer, named_parameters=model.named_parameters())

        hvd.broadcast_parameters(model.state_dict(), root_rank=0)
        hvd.broadcast_optimizer_state(self.optimizer, root_rank=0)

I started getting this error:
ValueError: Tensor type torch.hpu.FloatTensor is not supported.

Also, I am using 8 Gaudi2 devices and a Docker image.


Below I have referenced the line where the error occurs.

From this post, it seems you have been able to run the yolox model?

No. This error appears when I try to train with Horovod instead of the PyTorch distributed module. Integrating Horovod into original_yolox and running it on an A100 works fine, but I get the error mentioned above when I run on Gaudi2. The post you mentioned uses the yolox_gaudi2 code.

We support DDP for scaling in PyTorch, not Horovod. Horovod is supported only with TensorFlow. See here and here.
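
To illustrate the DDP route, here is a rough single-process sketch of what replaces the Horovod calls. Wrapping the model in DistributedDataParallel broadcasts the parameters from rank 0 at construction and all-reduces gradients during backward, so hvd.DistributedOptimizer and hvd.broadcast_parameters have no direct counterpart to call. The function name run_single_step, the Linear stand-in model, and the file-based rendezvous are assumptions for illustration only. The "gloo" backend is used so the sketch runs on CPU; on Gaudi the backend is, as far as I know, "hccl" after importing habana_frameworks.torch.distributed.hccl, but please verify that against the Habana documentation for your SW version.

```python
import os
import tempfile

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def run_single_step(rank=0, world_size=1, backend="gloo"):
    """Run one DDP training step and return the loss (hypothetical helper)."""
    # File-based rendezvous keeps the sketch self-contained; a real multi-card
    # job would normally get rank and world size from its launcher instead.
    init_file = os.path.join(tempfile.mkdtemp(), "ddp_init")
    dist.init_process_group(
        backend=backend,  # assumption: "hccl" on Gaudi, "gloo" here for CPU
        init_method=f"file://{init_file}",
        rank=rank,
        world_size=world_size,
    )
    try:
        torch.manual_seed(0)
        model = torch.nn.Linear(4, 2)  # stand-in for the YOLOX model
        # DDP syncs parameters from rank 0 and averages gradients across ranks.
        ddp_model = DDP(model)
        # A plain optimizer is enough; no distributed optimizer wrapper.
        optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.1)

        inputs = torch.randn(8, 4)
        targets = torch.randn(8, 2)
        loss = torch.nn.functional.mse_loss(ddp_model(inputs), targets)
        optimizer.zero_grad()
        loss.backward()  # gradient all-reduce happens during backward
        optimizer.step()
        return loss.item()
    finally:
        dist.destroy_process_group()


loss_value = run_single_step()
```

Because every rank builds the optimizer fresh from the already-synchronized parameters, a separate broadcast of the optimizer state is usually only a concern when resuming from a checkpoint.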