Support for Mixtral - Optimum Habana

Does anyone know when/if Optimum Habana will support Mixtral models? We tried some hacks, but…

  • Updating transformers breaks the “adapt transformers” functionality that Optimum provides (which, as I understand it, is needed for better performance)
  • Not updating transformers causes an error when reading the Mixtral model config (which only recent versions of transformers support)
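
To see which side of this conflict a given environment is on, one option is to check the installed transformers version against the release that introduced the Mixtral config. This is a minimal sketch, not an Optimum Habana API: the 4.36.0 threshold is an assumption (it is not stated in this thread), and the helper names `parse_version`, `supports_mixtral`, and `installed_transformers_version` are hypothetical.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Assumption: the Mixtral config first shipped in transformers 4.36.0.
MIXTRAL_MIN_TRANSFORMERS = (4, 36, 0)


def parse_version(version: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a version string, ignoring suffixes like '.dev0'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def supports_mixtral(version: str) -> bool:
    """Return True if this transformers version is at or above the assumed Mixtral threshold."""
    return parse_version(version) >= MIXTRAL_MIN_TRANSFORMERS


def installed_transformers_version() -> Optional[str]:
    """Return the installed transformers version, or None if the package is not installed."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed")
    elif supports_mixtral(found):
        print(f"transformers {found} should be able to read a Mixtral config")
    else:
        print(f"transformers {found} predates the assumed Mixtral threshold")
```

This only reports whether the installed version is new enough to parse a Mixtral config; it says nothing about whether Optimum Habana's adapted model code works with that version.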

We are able to load and use Mistral 7B models and fine-tunes of those, just not Mixtral MoE models.

Optimum Habana doesn’t support Mixtral as of the latest release.

@dwhitena

Just to clarify, what use case are you looking for?
Is it inference, fine-tuning (if so, which dataset/task), or training?

Thanks
Sayantan