GigaChat3-10B-A1.8B

GigaChat · Created at 19.11.2025 19:15

Ratings for leaderboard tasks


| Task name | Result | Place in the rating |
|---|---|---|
| Agricultural industry | 0.545 | 3 |
| Medicine and healthcare | 0.629 | 10 |

Information about the submission

MERA version
1.0.0
Torch version
2.9.1+cu128
Codebase version
435b60a
CUDA version
1
Model weight precision
bf8
Seed
1234
Batch size
1
Transformers version
4.57.1
Number and type of GPUs
1× NVIDIA A100-SXM4-80GB
Architecture
local-chat-completions

Team:

GigaChat

Name of the ML model:

GigaChat3-10B-A1.8B

Model size

10.0B

Model type:

Open

SFT

MoE

Architecture description:

Introducing `GigaChat3-10B-A1.8B`, an instruct model from the GigaChat family. The model is built on a Mixture-of-Experts (MoE) architecture with 10B total and 1.8B active parameters. The architecture includes **Multi-head Latent Attention (MLA)** and **Multi-Token Prediction (MTP)**, which optimize the model for high inference throughput.
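For illustration, below is a minimal inference sketch using the Hugging Face `transformers` API (version 4.57.1, as reported above). The repository id is an assumption and may differ from the official one, and `trust_remote_code` is enabled on the assumption that the MLA/MTP modules ship as custom modeling code; this is a sketch, not the submission's evaluation setup.

```python
# Minimal chat-inference sketch for GigaChat3-10B-A1.8B via transformers.
# Assumptions: the Hugging Face repo id below and the need for trust_remote_code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ai-sage/GigaChat3-10B-A1.8B"  # assumed repo id, verify before use

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # assumed weight precision
    device_map="auto",           # fits on a single A100-80GB at bf16
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Briefly describe the MoE architecture."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```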

License:

MIT