GigaChat-Pro

GIGACHAT Created at 04.07.2024 10:42
Overall result: 0.537
The submission does not contain all the required tasks

Ratings for leaderboard tasks

Task name Result Metric
LCS 0.09 Accuracy
RCB 0.53 / 0.449 Accuracy / F1 macro
USE 0.338 Grade norm
RWSD 0.585 Accuracy
PARus 0.884 Accuracy
ruTiE 0.791 Accuracy
MultiQ 0.369 / 0.247 F1 / Exact match
CheGeKa 0.104 / 0 F1 / Exact match
ruModAr 0.866 Exact match
ruMultiAr 0.273 Exact match
MathLogicQA 0.467 Accuracy
ruWorldTree 0.939 / 0.939 Accuracy / F1 macro
ruOpenBookQA 0.873 / 0.872 Accuracy / F1 macro

Evaluation on open tasks:

Task name Result Metric
BPS 0.318 Accuracy
ruMMLU 0.816 Accuracy
SimpleAr 0.971 Exact match
ruHumanEval 0.013 / 0.064 / 0.128 Pass@k
ruHHH 0.764
ruHateSpeech 0.751
ruDetox 0.238
ruEthics
Correct Good Ethical
Virtue -0.493 -0.449 -0.394
Law -0.493 -0.423 -0.392
Moral -0.492 -0.464 -0.399
Justice -0.447 -0.4 -0.345
Utilitarianism -0.422 -0.374 -0.322
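
The page does not state how the overall result is derived. As a minimal sketch, assuming that each task's metrics are averaged first, that the task scores are then averaged with equal weight, and that ruHHH, ruHateSpeech, ruDetox and ruEthics are left out of the total, the figures in the two tables above reproduce the reported 0.537. The names `SCORED_TASKS` and `overall_score` are hypothetical and used only for illustration.

```python
from statistics import mean

# Per-task metric values copied from the two result tables above.
# Assumption: ruHHH, ruHateSpeech, ruDetox and ruEthics do not count
# towards the overall result, so they are omitted here.
SCORED_TASKS = {
    "LCS": [0.09],
    "RCB": [0.53, 0.449],
    "USE": [0.338],
    "RWSD": [0.585],
    "PARus": [0.884],
    "ruTiE": [0.791],
    "MultiQ": [0.369, 0.247],
    "CheGeKa": [0.104, 0.0],
    "ruModAr": [0.866],
    "ruMultiAr": [0.273],
    "MathLogicQA": [0.467],
    "ruWorldTree": [0.939, 0.939],
    "ruOpenBookQA": [0.873, 0.872],
    "BPS": [0.318],
    "ruMMLU": [0.816],
    "SimpleAr": [0.971],
    "ruHumanEval": [0.013, 0.064, 0.128],
}


def overall_score(tasks):
    """Average the metrics within each task, then average across tasks."""
    return mean(mean(values) for values in tasks.values())


print(round(overall_score(SCORED_TASKS), 3))  # 0.537
```

Averaging within a task first keeps a task with several metrics (for example ruHumanEval with three Pass@k values) from weighing more than a single-metric task.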

Information about the submission:

Mera version
-
Torch Version
-
The version of the codebase
-
CUDA version
-
Precision of the model weights
-
Seed
-
Batch
-
Transformers version
-
The number of GPUs and their type
-
Architecture
-

Team:

GIGACHAT

Name of the ML model:

GigaChat-Pro

Additional links:

https://developers.sber.ru/docs/ru/gigachat/api/overview

Architecture description:

GigaChat Pro (version 1.0.26.8) is a large language model (LLM) with 30B parameters that was fine-tuned on an instruction corpus and has a context length of 8192 tokens. This version has been available to users via the API since 13.07.

Description of the training:

-

Pretrain data:

-

Training Details:

-

License:

Proprietary model by Sber

Strategy, generation and parameters:

Code version v.1.1.0. No parameters were changed; all are used as prepared by the organizers. Details:
- 2 x NVIDIA A100 + accelerate
- dtype float16
- PyTorch 2.3.1 + CUDA 12.1
- Transformers 4.42.3
- Context length 8192
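
When trying to rerun an evaluation under the same setup, it can help to confirm that the local environment matches the library versions quoted above. The following is a rough sketch using only the standard library; the helper names `installed_versions` and `mismatches` are hypothetical, and the package names `torch` and `transformers` are assumed to be the distribution names the submitters refer to.

```python
from importlib import metadata

# Library versions quoted in the submission details above.
EXPECTED = {"torch": "2.3.1", "transformers": "4.42.3"}


def installed_versions(packages):
    """Return {package: installed version, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatches(expected, installed):
    """Return packages whose installed version differs from the expected one.

    A local build tag such as "+cu121" is ignored in the comparison.
    """
    bad = {}
    for name, want in expected.items():
        have = installed.get(name)
        if have is None or have.split("+")[0] != want:
            bad[name] = have
    return bad


if __name__ == "__main__":
    # An empty dict means the environment matches the quoted versions.
    print(mismatches(EXPECTED, installed_versions(EXPECTED)))
```

This only compares package versions; the GPU type, dtype and context length listed above would still need to be set in the evaluation configuration itself.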

Ratings by subcategory

Metric: Accuracy
Model, team Honest Helpful Harmless
GigaChat-Pro, GIGACHAT 0.689 0.78 0.828
Model, team: GigaChat-Pro, GIGACHAT
Subcategory Result
Anatomy 0.9
Virology 0.938
Astronomy 0.8
Marketing 0.657
Nutrition 0.952
Sociology 1
Management 0.867
Philosophy 0.647
Prehistory 0.7
Human aging 1
Econometrics 0.818
Formal logic 0.9
Global facts 0.7
Jurisprudence 0.769
Miscellaneous 0.682
Moral disputes 0.6
Business ethics 0.8
Biology (college) 0.889
Physics (college) 0.9
Human sexuality 1
Moral scenarios 0.4
World religions 0.846
Abstract algebra 0.9
Medicine (college) 0.863
Machine learning 0.9
Medical genetics 0.727
Professional law 0.75
PR 0.643
Security studies 1
Chemistry (school) 0.818
Computer security 0.6
International law 0.944
Logical fallacies 0.7
Politics 1
Clinical knowledge 0.818
Conceptual physics 1
Math (college) 0.8
Biology (high school) 0.905
Physics (high school) 0.7
Chemistry (high school) 0.6
Geography (high school) 0.899
Professional medicine 1
Electrical engineering 1
Elementary mathematics 0.5
Psychology (high school) 1
Statistics (high school) 0.9
History (high school) 1
Math (high school) 0.3
Professional accounting 0.9
Professional psychology 0.9
Computer science (college) 0.636
World history (high school) 0.875
Macroeconomics 0.941
Microeconomics 0.867
Computer science (high school) 0.583
European history 0.727
Government and politics 0.778
Model, team SIM FL STA
GigaChat-Pro, GIGACHAT 0.59 0.76 0.459
Correct
Model, team Virtue Law Moral Justice Utilitarianism
GigaChat-Pro, GIGACHAT -0.493 -0.493 -0.492 -0.447 -0.422
Good
Model, team Virtue Law Moral Justice Utilitarianism
GigaChat-Pro, GIGACHAT -0.449 -0.423 -0.464 -0.4 -0.374
Ethical
Model, team Virtue Law Moral Justice Utilitarianism
GigaChat-Pro, GIGACHAT -0.394 -0.392 -0.399 -0.345 -0.322
Model, team Women Men LGBT Nationalities Migrants Other
GigaChat-Pro, GIGACHAT 0.759 0.8 0.647 0.649 0.429 0.836