GigaChat-3.1-Ultra

GigaChat Created at 28.03.2026 08:28

0.361

The overall result

Ratings for leaderboard tasks


Task name	Result	Metric
YABLoCo	0.024 / 0.019	EM / pass@k
stRuCom	0.296	chrF
RealCode	0.349 / 0.971	pass@k / execution_success
UnitTests	0.245	CodeBLEU
ruCodeEval	0.542 / 0.612 / 0.628	pass@k
JavaTestGen	0.159 / 0.449	pass@k / compile@1
ruHumanEval	0.539 / 0.578 / 0.585	pass@k
RealCodeJava	0.289 / 0.96	pass@k / execution_success
CodeLinterEval	0.489 / 0.507 / 0.518	pass@k
ruCodeReviewer	0.015 / 0.131 / 0.065 / 0.073 / 0.075	chrF / BLEU / judge@1 / judge@5 / judge@10
CodeCorrectness	0.728	EM
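
Several rows above report pass@k. The page does not say how the leaderboard computes it, nor which values of k the slash-separated numbers correspond to. As a rough illustration only, the sketch below implements the widely used unbiased estimator 1 - C(n-c, k) / C(n, k), where n samples are generated per task and c of them pass. The function name and the sample counts are assumptions made for this example, not values taken from the submission.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single task.

    n: number of generated samples (assumed, not stated on the page)
    c: number of those samples that pass the tests
    k: sample budget the metric is evaluated at
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k failing samples: every size-k subset contains a passing one.
    if n - c < k:
        return 1.0
    # Probability that a random size-k subset contains no passing sample,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical task: 10 samples, 3 of which pass.
    for k in (1, 5, 10):
        print(k, pass_at_k(10, 3, k))
```

A benchmark-level score would then be the mean of this per-task value over all tasks, again assuming the leaderboard follows this convention.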

Information about the submission

MERA version

v1.0.0

Torch Version

2.9.0

The version of the codebase

0ac3a14

CUDA version

12.8

Precision of the model weights

auto

Seed

1234

Batch

Transformers version

4.57.1

The number of GPUs and their type

1 x NVIDIA A100-SXM4-80GB

Architecture

gigachat-completion

Team:

GigaChat

Name of the ML model:

GigaChat-3.1-Ultra

Link to the ML model:

https://huggingface.co/ai-sage/GigaChat3.1-702B-A36B

Model size

715.0B

Model type:

Open

SFT

Architecture description:

GigaChat 3.1 Ultra is the flagship instruct model of the GigaChat family. It is a large-scale Mixture-of-Experts (MoE) model with 702B total parameters and 36B active parameters, designed for multilingual assistant workloads, reasoning, code, tool use, and large-cluster deployment.
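
To make the Mixture-of-Experts figures concrete, the short sketch below computes the share of parameters that are active, using only the 702B total and 36B active counts quoted in the description (the separate "Model size" field lists a different total). The helper name is hypothetical.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of parameters that are active, both given in billions."""
    if total_params_b <= 0 or not 0 <= active_params_b <= total_params_b:
        raise ValueError("expected 0 <= active <= total and total > 0")
    return active_params_b / total_params_b


if __name__ == "__main__":
    # Figures quoted in the architecture description.
    share = active_fraction(702, 36)
    print(f"{share:.1%} of parameters are active")
```

With these figures, roughly one parameter in twenty is active.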

Description of the training:

The model was trained in four stages: pretraining, Stage-1.5, SFT, and DPO.

Pretrain data:

The base GigaChat 3 training corpus spans 10 languages and includes books, academic material, code datasets, and mathematics datasets. All data goes through deduplication, language filtering, and automatic quality checks based on heuristics and classifiers.

License:

MIT

Inference parameters

Generation Parameters:
realcode - do_sample=true;max_gen_toks=4096;temperature=0.7;repetition_penalty=1.05;top_p=0.8;until=["<|endoftext|>","<|im_end|>"];
realcodejava - do_sample=true;max_gen_toks=4096;temperature=0.7;repetition_penalty=1.05;top_p=0.8;until=["<|endoftext|>","<|im_end|>"];
javatestgen - do_sample=true;max_gen_toks=4096;temperature=0.2;top_p=0.9;until=["<|endoftext|>","<|im_end|>"];
yabloco_oracle - max_gen_toks=2048;do_sample=false;until=["<|endoftext|>","<|im_end|>","\n\n\n","\\sclass\\s","\\sdef\\s","^def\\s","^class\\s","^if\\s","@","^#"];
unittests - do_sample=false;max_gen_toks=1024;until=["\n\n"];
codecorrectness - until=["\n\n"];do_sample=false;temperature=0;
codelintereval - do_sample=true;temperature=0.6;max_gen_toks=1024;until=["\n\n"];
rucodereviewer - temperature=0;do_sample=false;max_gen_toks=1000;until=["\n\n"];
strucom - do_sample=false;max_gen_toks=512;until=["\n\n"];
rucodeeval_code - do_sample=true;temperature=0.6;max_gen_toks=1024;until=["\nclass","\ndef","\n#","\nif","\nprint"];
ruhumaneval_code - do_sample=true;temperature=0.6;max_gen_toks=1024;until=["\nclass","\ndef","\n#","\nif","\nprint"];
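
Each entry above follows the pattern `task - key=value;key=value;...`, with the `until` stop sequences written as a JSON-style list. For comparing settings across tasks, a single entry can be turned into a dictionary. The following is a minimal sketch, not the leaderboard's own tooling: the function name is hypothetical, and it assumes that no value contains a semicolon, which holds for the entries listed here.

```python
import json
from typing import Any


def parse_generation_params(entry: str) -> tuple[str, dict[str, Any]]:
    """Parse one 'task - k=v;k=v;...' entry into (task, params)."""
    task, sep, rest = entry.strip().partition(" - ")
    if not sep:
        raise ValueError("expected 'task - key=value;...'")
    params: dict[str, Any] = {}
    # Assumption: ';' never occurs inside a value.
    for item in rest.strip().rstrip(";").split(";"):
        key, eq, raw = item.strip().partition("=")
        if not eq:
            raise ValueError(f"malformed item: {item!r}")
        try:
            # JSON decoding covers true/false, numbers, and string lists,
            # including escapes such as \n inside stop sequences.
            params[key] = json.loads(raw)
        except json.JSONDecodeError:
            params[key] = raw  # keep anything else as a plain string
    return task.strip(), params


if __name__ == "__main__":
    # Raw string so that \n stays as two characters, as on the page.
    line = r'codelintereval - do_sample=true;temperature=0.6;max_gen_toks=1024;until=["\n\n"];'
    name, params = parse_generation_params(line)
    print(name, params)
```

Decoding values with `json.loads` keeps the sketch short. A stricter parser would be needed if a stop sequence ever contained a semicolon.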
