Does anyone know when/if Optimum Habana will support Mixtral models? We tried some hacks, but…
- Updating transformers breaks the "adapt transformers" functionality provided by Optimum Habana (which, as I understand it, is needed for good performance on Gaudi)
- Not updating transformers means the Mixtral model config cannot be read (Mixtral is only supported by recent versions of transformers)
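For reference, Mixtral model classes landed in transformers 4.36.0, so the pinned version can be sanity-checked before attempting to load the config. A minimal sketch (the helper name `supports_mixtral` is mine, not part of any library):

```python
def supports_mixtral(transformers_version: str) -> bool:
    """Return True if the given transformers version ships Mixtral support.

    Mixtral model/config classes were added in transformers 4.36.0;
    older versions cannot parse a Mixtral model config.
    """
    # Compare only major.minor; handles versions like "4.36.0.dev0" too.
    major, minor = (int(part) for part in transformers_version.split(".")[:2])
    return (major, minor) >= (4, 36)
```

Usage would be something like `supports_mixtral(transformers.__version__)` to see whether the currently pinned version is new enough.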
We are able to load and use Mistral 7B models and fine-tunes of those, just not Mixtral MoE models.