
> nothing comes close to GPT-OSS-120B on Cerebras

That's temporary. Cerebras speeds up everything, so if Nemotron is good quality, it's just a matter of time until they add it.

That's unlikely. Cerebras doesn't speed up everything. Could it? I don't know, I'm not an insider. But does it today? Evidently not: their page [1] lists only 4 production models and 2 preview models.

[1] https://inference-docs.cerebras.ai/models/overview


They need to compile the model for their chips. Standard transformers are easier, so models like GPT-OSS, Qwen, and GLM will get deployed if there is demand.

Nemotron, on the other hand, is a hybrid (Transformer + Mamba-2), so it will be more challenging to compile for Cerebras/Groq chips.
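To illustrate the compilation point (a toy sketch, not Nemotron's actual layer layout or any vendor's compiler): a standard transformer is a uniform stack of one block type, so the compiler only has to map a single repeated compute pattern onto the chip, while a hybrid stack interleaves two different patterns that both need hardware mappings.

```python
# Illustrative only: layer layouts are made up, not the real configs.
STANDARD_TRANSFORMER = ["attention"] * 8
HYBRID_STACK = ["mamba2", "mamba2", "attention", "mamba2",
                "mamba2", "attention", "mamba2", "mamba2"]

def distinct_block_types(layers):
    """Each distinct block type is a separate compute pattern the
    accelerator compiler must support and schedule."""
    return sorted(set(layers))

print(distinct_block_types(STANDARD_TRANSFORMER))  # ['attention']
print(distinct_block_types(HYBRID_STACK))          # ['attention', 'mamba2']
```

One kernel family vs. two (plus the interleaving) is roughly why the hybrid is more work to bring up on a new chip.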

(Me thinks Nvidia is purposefully picking an architecture + FP4 combination that is easy to ship on Nvidia chips, but harder for TPU or Cerebras/Groq to deploy.)
