GigaChat-20B-A3B

GIGACHAT · Created at 13.12.2024 07:11
Overall result: 0.513
Place in the rating: 87
In the top by tasks (place in the rating):
9  ruModAr (one of the main tasks)

Weak tasks (place in the rating):
300  RWSD
178  PARus
230  RCB
179  ruEthics
182  MultiQ
187  ruWorldTree
155  ruOpenBookQA
82  CheGeKa
181  ruMMLU
123  ruHateSpeech
181  ruDetox
151  ruHHH
106  ruTiE
181  ruHumanEval
34  USE
146  MathLogicQA
184  ruMultiAr
277  SimpleAr
412  LCS
234  BPS
112  MaMuRAMu
153  ruCodeEval

Ratings for leaderboard tasks


Task name     Result                  Metric
LCS           0.07                    Accuracy
RCB           0.518 / 0.441           Accuracy / F1 macro
USE           0.334                   Grade norm
RWSD          0.512                   Accuracy
PARus         0.842                   Accuracy
ruTiE         0.758                   Accuracy
MultiQ        0.393 / 0.187           F1 / Exact match
CheGeKa       0.318 / 0.252           F1 / Exact match
ruModAr       0.87                    Exact match
MaMuRAMu      0.741                   Accuracy
ruMultiAr     0.272                   Exact match
ruCodeEval    0.041 / 0.054 / 0.061   Pass@k
MathLogicQA   0.455                   Accuracy
ruWorldTree   0.901 / 0.901           Accuracy / F1 macro
ruOpenBookQA  0.833 / 0.833           Accuracy / F1 macro
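The three ruCodeEval (and, below, ruHumanEval) numbers are pass@k scores at increasing k. For reference, the standard unbiased pass@k estimator can be sketched as follows; the function name is illustrative and not taken from the MERA codebase.

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: given n generated samples per task,
    of which c pass the unit tests, estimate the probability that at
    least one of k randomly drawn samples is correct."""
    if n - c < k:
        return 1.0  # every size-k draw must contain a correct sample
    # 1 - P(all k drawn samples are incorrect)
    return 1.0 - math.prod((n - c - i) / (n - i) for i in range(k))

# Example: 10 samples per task with 1 passing gives pass@1 of 0.1
scores = [pass_at_k(10, 1, k) for k in (1, 5, 10)]
```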

Evaluation on open tasks:

Task name Result Metric
BPS 0.921 Accuracy
ruMMLU 0.587 Accuracy
SimpleAr 0.923 Exact match
ruHumanEval 0.037 / 0.04 / 0.043 Pass@k
ruHHH 0.73
ruHateSpeech 0.777
ruDetox 0.191
ruEthics
Correct Good Ethical
Virtue 0.299 0.314 0.359
Law 0.337 0.329 0.37
Moral 0.351 0.334 0.403
Justice 0.251 0.266 0.327
Utilitarianism 0.249 0.295 0.329

Information about the submission:

MERA version: v.1.2.0
Torch version: 2.4.0
Codebase version: db539c9
CUDA version: 12.1
Precision of the model weights: -
Seed: 1234
Batch size: 1
Transformers version: 4.46.0.dev0
Number of GPUs and their type: 5 x NVIDIA H100 80GB HBM3
Architecture: gigachat_llms

Team:

GIGACHAT

Name of the ML model:

GigaChat-20B-A3B

Model size

20.0B

Model type:

Open

SFT

MoE

Architecture description:

GigaChat-20B-A3B is a large language model (LLM) fine-tuned on an instruction corpus, with a context length of 32k tokens. It is a Mixture-of-Experts model with 3.3B active parameters. The model is available at https://huggingface.co/ai-sage/GigaChat-20B-A3B-instruct (instruct version) and at https://huggingface.co/ai-sage/GigaChat-20B-A3B-base (base version).

Description of the training:

-

Pretrain data:

-

License:

Open-source model by Sber

Inference parameters

Generation parameters:
simplear: do_sample=false; until=["\n"]
chegeka: do_sample=false; until=["\n"]
rudetox: do_sample=false; until=["\n"]
rumultiar: do_sample=false; until=["\n"]
use: do_sample=false; until=["\n", "."]
multiq: do_sample=false; until=["\n"]
rumodar: do_sample=false; until=["\n"]
ruhumaneval: do_sample=true; until=["\nclass", "\ndef", "\n#", "\nif", "\nprint"]; temperature=0.6
rucodeeval: do_sample=true; until=["\nclass", "\ndef", "\n#", "\nif", "\nprint"]; temperature=0.6
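For programmatic use, the per-task settings above can be collected into a plain mapping. This is a transcription sketch only; the dictionary layout and key names (`until`, `temperature`) mirror the listing and are not an official MERA API.

```python
# Per-task generation settings, transcribed from the listing above.
# Greedy decoding (do_sample=False) everywhere except the two code
# tasks, which sample at temperature 0.6 with code-aware stop strings.
GEN_PARAMS = {
    "simplear":    {"do_sample": False, "until": ["\n"]},
    "chegeka":     {"do_sample": False, "until": ["\n"]},
    "rudetox":     {"do_sample": False, "until": ["\n"]},
    "rumultiar":   {"do_sample": False, "until": ["\n"]},
    "use":         {"do_sample": False, "until": ["\n", "."]},
    "multiq":      {"do_sample": False, "until": ["\n"]},
    "rumodar":     {"do_sample": False, "until": ["\n"]},
    "ruhumaneval": {"do_sample": True, "temperature": 0.6,
                    "until": ["\nclass", "\ndef", "\n#", "\nif", "\nprint"]},
    "rucodeeval":  {"do_sample": True, "temperature": 0.6,
                    "until": ["\nclass", "\ndef", "\n#", "\nif", "\nprint"]},
}
```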

Description of the template:
{% if messages[0]['role'] == 'system' -%}
 {%- set loop_messages = messages[1:] -%}
 {%- set system_message = bos_token + messages[0]['content'] + additional_special_tokens[1] -%}
{%- else -%}
 {%- set loop_messages = messages -%}
 {%- set system_message = bos_token + '' -%}
{%- endif -%}
{%- for message in loop_messages %}
 {% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}
 {{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}
 {% endif %}

 {%- if loop.index0 == 0 -%}
 {{ system_message -}}
 {%- endif -%}
 {%- if message['role'] == 'user' -%}
 {{ message['role'] + additional_special_tokens[0] + message['content'] + additional_special_tokens[1] -}}
 {{ 'available functions' + additional_special_tokens[0] + additional_special_tokens[2] + additional_special_tokens[3] + additional_special_tokens[1] -}}
 {%- endif -%}
 {%- if message['role'] == 'assistant' -%}
 {{ message['role'] + additional_special_tokens[0] + message['content'] + additional_special_tokens[1] -}}
 {%- endif -%}
 {%- if loop.last and add_generation_prompt -%}
 {{ 'assistant' + additional_special_tokens[0] -}}
 {%- endif -%}
{%- endfor %}
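The prompt layout this Jinja template produces can be sketched in plain Python as follows. The special-token strings (`<s>`, `<role_sep>`, etc.) are placeholders rather than the tokenizer's real tokens, and the function is an illustration of the template's logic, not the evaluation harness's code.

```python
# Placeholder special tokens; the real strings come from the
# tokenizer config (bos_token, additional_special_tokens[0..3]).
BOS = "<s>"
ROLE_SEP, MSG_END, FUNC_A, FUNC_B = "<role_sep>", "<msg_end>", "<f0>", "<f1>"

def render(messages, add_generation_prompt=True):
    """Approximate the chat template: optional system message, strictly
    alternating user/assistant turns, an 'available functions' block
    after each user turn, and a trailing assistant prompt."""
    out = []
    if messages and messages[0]["role"] == "system":
        out.append(BOS + messages[0]["content"] + MSG_END)
        messages = messages[1:]
    else:
        out.append(BOS)
    for i, m in enumerate(messages):
        # user turns must sit at even indices, assistant at odd ones
        if (m["role"] == "user") != (i % 2 == 0):
            raise ValueError("Conversation roles must alternate user/assistant/...")
        out.append(m["role"] + ROLE_SEP + m["content"] + MSG_END)
        if m["role"] == "user":
            out.append("available functions" + ROLE_SEP + FUNC_A + FUNC_B + MSG_END)
    if add_generation_prompt:
        out.append("assistant" + ROLE_SEP)
    return "".join(out)
```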


Ratings by subcategory

USE (Metric: Grade Norm)
Model, team 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 8_0 8_1 8_2 8_3 8_4
GigaChat-20B-A3B
GIGACHAT
0.9 0.233 0.767 0.2 0.133 0.433 0.167 - 0.133 0.167 0.167 0.133 0.533 0.133 0.067 0.567 0 0.067 0.033 0.033 0.067 0.567 0.433 0.033 0.033 0.7 0.333 0.433 0.433 0.3 0.5
ruHHH
Model, team Honest Helpful Harmless
GigaChat-20B-A3B
GIGACHAT
0.689 0.797 0.707
ruMMLU
Model, team Anatomy Virology Astronomy Marketing Nutrition Sociology Management Philosophy Prehistory Human aging Econometrics Formal logic Global facts Jurisprudence Miscellaneous Moral disputes Business ethics Biology (college) Physics (college) Human Sexuality Moral scenarios World religions Abstract algebra Medicine (college) Machine learning Medical genetics Professional law PR Security studies Chemistry (college) Computer security International law Logical fallacies Politics Clinical knowledge Conceptual physics Math (college) Biology (high school) Physics (high school) Chemistry (high school) Geography (high school) Professional medicine Electrical engineering Elementary mathematics Psychology (high school) Statistics (high school) History (high school) Math (high school) Professional accounting Professional psychology Computer science (college) World history (high school) Macroeconomics Microeconomics Computer science (high school) European history Government and politics
GigaChat-20B-A3B
GIGACHAT
0.533 0.476 0.697 0.825 0.706 0.811 0.728 0.659 0.664 0.61 0.377 0.476 0.36 0.713 0.764 0.618 0.63 0.667 0.389 0.702 0.316 0.772 0.37 0.584 0.384 0.65 0.403 0.62 0.714 0.41 0.67 0.769 0.663 0.717 0.675 0.513 0.39 0.752 0.371 0.488 0.793 0.621 0.586 0.454 0.792 0.481 0.779 0.367 0.404 0.582 0.47 0.768 0.659 0.664 0.69 0.739 0.798
ruDetox
Model, team SIM FL STA
GigaChat-20B-A3B
GIGACHAT
0.35 0.783 0.748
MaMuRAMu
Model, team Anatomy Virology Astronomy Marketing Nutrition Sociology Management Philosophy Pre-History Gerontology Econometrics Formal logic Global facts Jurisprudence Miscellaneous Moral disputes Business ethics Biology (college) Physics (college) Human sexuality Moral scenarios World religions Abstract algebra Medicine (college) Machine Learning Genetics Professional law PR Security Chemistry (college) Computer security International law Logical fallacies Politics Clinical knowledge Conceptual physics Math (college) Biology (high school) Physics (high school) Chemistry (high school) Geography (high school) Professional medicine Electrical Engineering Elementary mathematics Psychology (high school) Statistics (high school) History (high school) Math (high school) Professional Accounting Professional psychology Computer science (college) World history (high school) Macroeconomics Microeconomics Computer science (high school) European history Government and politics
GigaChat-20B-A3B
GIGACHAT
0.511 0.822 0.617 0.62 0.816 0.845 0.638 0.719 0.788 0.646 0.756 0.692 0.475 0.822 0.731 0.765 0.71 0.644 0.667 0.737 0.246 0.78 0.689 0.787 0.711 0.803 0.821 0.684 0.825 0.733 0.844 0.833 0.723 0.912 0.667 0.732 0.667 0.822 0.614 0.692 0.857 0.841 0.778 0.689 0.879 0.867 0.914 0.727 0.831 0.877 0.756 0.754 0.81 0.662 0.465 0.743 0.789
ruEthics (Correct)
Model, team Virtue Law Moral Justice Utilitarianism
GigaChat-20B-A3B
GIGACHAT
0.299 0.337 0.351 0.251 0.249
ruEthics (Good)
Model, team Virtue Law Moral Justice Utilitarianism
GigaChat-20B-A3B
GIGACHAT
0.314 0.329 0.334 0.266 0.295
ruEthics (Ethical)
Model, team Virtue Law Moral Justice Utilitarianism
GigaChat-20B-A3B
GIGACHAT
0.359 0.37 0.403 0.327 0.329
ruHateSpeech
Model, team Women Men LGBT Nationalities Migrants Other
GigaChat-20B-A3B
GIGACHAT
0.815 0.629 0.647 0.757 0.714 0.852