GigaChat Pro

Created on 29.01.2024 12:24

General assessment: 0.514


Task name Result Metric
BPS 0.224 Accuracy
LCS 0.12 Accuracy
RCB 0.562 / 0.484 Avg. F1 / Accuracy
USE 0.169 Grade Norm
RWSD 0.546 Accuracy
PARus 0.896 Accuracy
ruTiE 0.779 Accuracy
MultiQ 0.192 / 0.097 F1-score / EM
ruMMLU 0.811 Accuracy
CheGeKa 0.451 / 0.363 F1 / EM
ruModAr 0.589 Accuracy
SimpleAr 0.96 Accuracy
ruMultiAr 0.226 Accuracy
MathLogicQA 0.395 Accuracy
ruHumanEval 0.021 / 0.107 / 0.213 pass@k
ruWorldTree 0.96 / 0.96 Avg. F1 / Accuracy
ruOpenBookQA 0.875 / 0.874 Avg. F1 / Accuracy
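The ruHumanEval row reports pass@k at three values of k. Assuming the standard unbiased pass@k estimator (probability that at least one of k samples drawn from n generations, c of them correct, passes), a minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator for one task:
    n generations, c of which pass the tests."""
    if n - c < k:
        return 1.0  # every possible k-subset contains a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical per-task results: (n generations, c correct).
tasks = [(10, 1), (10, 0), (10, 3)]
score = sum(pass_at_k(n, c, 1) for n, c in tasks) / len(tasks)
```

The benchmark score is the average of this estimator over all tasks; the three reported numbers correspond to three different k.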

Evaluation on diagnostic datasets:

Diagnostic datasets are not included in the overall rating.


Task name Result Metric
ruHHH 0.792 Accuracy
  • Honest: 0.705
  • Harmless: 0.897
  • Helpful: 0.78
ruHateSpeech 0.758 Accuracy
  • Women: 0.75
  • Men: 0.771
  • LGBT: 0.706
  • Nationality: 0.703
  • Migrants: 0.571
  • Other: 0.836
ruDetox
  • Overall average score (J): 0.128
  • Meaning preservation score (SIM): 0.504
  • Naturalness score (FL): 0.777
  • Style Transfer Accuracy (STA): 0.276
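In the detoxification shared-task convention, the overall score J is the product of STA, SIM and FL computed per sample and then averaged over the test set, which is why the reported J need not equal the product of the three reported averages. A minimal sketch under that assumption, with hypothetical per-sample scores:

```python
def detox_j(sta, sim, fl):
    """Overall detox score J: mean over samples of STA_i * SIM_i * FL_i."""
    assert len(sta) == len(sim) == len(fl)
    return sum(s * m * f for s, m, f in zip(sta, sim, fl)) / len(sta)

# Hypothetical per-sample sub-scores for three model outputs.
sta = [0.2, 0.4, 0.3]
sim = [0.6, 0.5, 0.4]
fl  = [0.8, 0.7, 0.9]
j = detox_j(sta, sim, fl)
```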

ruEthics (metric: MCC for each of the 5 ethical criteria against 3 questions)

               Correct   Good     Ethical
Virtue         -0.516    -0.529   -0.538
Law            -0.548    -0.527   -0.56
Moral          -0.492    -0.515   -0.541
Justice        -0.498    -0.494   -0.505
Utilitarianism -0.422    -0.421   -0.435
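Each cell in the ruEthics table is a Matthews correlation coefficient between the model's binary answers and one annotated ethical criterion. A self-contained sketch of MCC from the binary confusion matrix:

```python
from math import sqrt

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Systematic disagreement yields a negative MCC,
# as in the table's negative cells.
labels = [1, 1, 0, 0]
preds  = [0, 0, 1, 1]
value = mcc(labels, preds)  # -1.0 for perfect anti-correlation
```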

Information about the submission:

Team:

SberDevices

Name of the ML model:

GigaChat Pro

Architecture description:

GigaChat Pro is a Large Language Model (LLM) with 29B parameters that was fine-tuned on an instruction corpus and has a context length of 8192 tokens.

Description of the training:

-

Pretrain data:

-

Training Details:

-

License:

Proprietary model by Sber

Strategy, generation and parameters:

Code version v1.1.0. All parameters were left unchanged and used as prepared by the organizers.

Details:
  • 2 x NVIDIA A100 + accelerate
  • dtype float16
  • PyTorch 2.0.1 + CUDA 11.7
  • Transformers 4.36.2
  • Context length 8192