GigaChat3-Ultra-702B-A36B-preview

GigaChat · Created at 19.11.2025 18:11

Ratings for leaderboard tasks


| Task name | Result | Place in the rating |
|---|---|---|
| Agricultural industry | 0.645 | 1 |
| Medicine and healthcare | 0.824 | 2 |

Information about the submission

Mera version: v1.0.0
Torch version: 2.9.1+cu128
Codebase version: 435b60a
CUDA version: 12.6
Model weights precision: bf8
Seed: 1234
Batch size: 1
Transformers version: 4.57.1
Number and type of GPUs: 1 x NVIDIA A100 80GB
Architecture: local-chat-completions
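
A minimal sketch of how a run with this setup could look through the public lm-evaluation-harness Python API, assuming its `local-chat-completions` backend pointed at an OpenAI-compatible endpoint serving the model; the endpoint URL and the task identifier `mera_industrial_task` are placeholders, and the actual MERA codebase (revision 435b60a) may expose a different entry point:

```python
# Hypothetical reproduction sketch: lm-evaluation-harness with the
# local-chat-completions backend listed under "Architecture" above.
# base_url and the task name are placeholders, not the real MERA identifiers.
import lm_eval

results = lm_eval.simple_evaluate(
    model="local-chat-completions",            # backend named in the submission
    model_args=(
        "model=GigaChat3-Ultra-702B-A36B-preview,"
        "base_url=http://localhost:8000/v1/chat/completions"  # assumed local endpoint
    ),
    tasks=["mera_industrial_task"],            # placeholder task id
    batch_size=1,                              # matches the reported batch size
    random_seed=1234,                          # matches the reported seed
)
print(results["results"])
```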

Team:

GigaChat

Name of the ML model:

GigaChat3-Ultra-702B-A36B-preview

Model size:

702.0B

Model type:

Open

SFT

MoE

Architecture description:

Introducing `GigaChat3-Ultra-702B-A36B-preview`, an instruct model of the GigaChat family. The model is built on a Mixture-of-Experts (MoE) architecture with 702B total and 36B active parameters. The architecture includes **Multi-head Latent Attention (MLA)** and **Multi-Token Prediction (MTP)**, which optimize the model for high inference throughput.
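
For intuition on the 702B-total / 36B-active split, here is a toy sketch of top-k expert routing in a single MoE layer: each token passes through only a few experts, so the parameters active per token are a small fraction of the total. The layer sizes, expert count, and routing scheme are illustrative only and do not correspond to the actual GigaChat3 implementation.

```python
# Toy Mixture-of-Experts layer: each token is processed by only top_k of
# num_experts feed-forward experts, so the active parameter count per token
# is a small fraction of the total. Sizes are illustrative, not the real
# GigaChat3-Ultra configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, num_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )
        self.router = nn.Linear(d_model, num_experts)  # routing logits per token
        self.top_k = top_k

    def forward(self, x):                                 # x: (tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)
        weights, idx = gate.topk(self.top_k, dim=-1)      # pick top_k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = (idx == e).any(dim=-1)                 # tokens routed to expert e
            if mask.any():
                w = weights[mask][idx[mask] == e].unsqueeze(-1)
                out[mask] += w * expert(x[mask])          # weighted expert output
        return out

layer = ToyMoELayer()
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64]); only 2 of 8 experts ran per token
```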

License:

MIT