Huawei openPangu-Ultra-MoE-718B-model: no Nvidia tech was used to build this model, only Ascend AI chips, with CANN rather than CUDA. From the release notes: "... trained from scratch on Ascend NPU, with a total parameter count of 718B and 39B activated parameters per token. The openPangu-Ultra-MoE-718B is trained on approximately 19 trillion tokens, and equipped with the capability to switch between fast and slow thinking." It looks like a state-of-the-art model, if not the best open model currently available. Link below; you can download it and use it.
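For anyone who wants to try it, here is a minimal usage sketch of how a chat model like this is typically loaded and queried. It assumes the weights are published in a Hugging Face transformers-compatible format; the repo id is a placeholder, and the model card may prescribe an Ascend/MindSpore-specific loading path instead, so check the linked instructions first.

```python
# Hypothetical usage sketch: loading a large MoE checkpoint with Hugging Face
# transformers. The repo id below is a placeholder (assumption), and a
# 718B-parameter model needs a multi-device setup; device_map="auto" shards
# the weights across whatever accelerators are visible.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "openPangu/openPangu-Ultra-MoE-718B"  # placeholder repo id, not confirmed

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,   # a custom MoE architecture likely needs remote code
    device_map="auto",        # shard across available devices
    torch_dtype="auto",
)

# Simple chat-style prompt. Whether fast/slow thinking is toggled via a system
# prompt, a special tag, or a generation flag is not confirmed here; consult
# the model card for the actual switch.
messages = [{"role": "user", "content": "Explain mixture-of-experts in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```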