GigaChat Lite

Created at 15.07.2024 06:04

Overall score: 0.504


Task name      Result                  Metric
BPS            0.412                   Accuracy
LCS            0.084                   Accuracy
RCB            0.543 / 0.452           Avg. F1 / Accuracy
USE            0.284                   Grade Norm
RWSD           0.627                   Accuracy
PARus          0.848                   Accuracy
ruTiE          0.726                   Accuracy
MultiQ         0.193 / 0.071           F1-score / EM
ruMMLU         0.783                   Accuracy
CheGeKa        0.063 / 0               F1 / EM
ruModAr        0.77                    EM
SimpleAr       0.9                     EM
ruMultiAr      0.216                   EM
MathLogicQA    0.45                    Accuracy
ruHumanEval    0.018 / 0.088 / 0.177   pass@k
ruWorldTree    0.897 / 0.897           Avg. F1 / Accuracy
ruOpenBookQA   0.823 / 0.822           Avg. F1 / Accuracy
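
The ruHumanEval row reports three pass@k values; k = 1, 5, 10 is assumed here, following the usual code-generation setup. A minimal sketch of the standard unbiased pass@k estimator (Chen et al., 2021), where n is the number of generations per task and c the number that pass the tests:

    from math import comb

    def pass_at_k(n: int, c: int, k: int) -> float:
        # Unbiased estimator: 1 - C(n-c, k) / C(n, k), the probability
        # that at least one of k samples drawn without replacement from
        # the n generations is correct, given c correct generations.
        if n - c < k:
            return 1.0
        return 1.0 - comb(n - c, k) / comb(n, k)

    # Example: 10 generations per task, 2 of them pass the tests
    print(pass_at_k(n=10, c=2, k=1))   # 0.2
    print(pass_at_k(n=10, c=2, k=10))  # 1.0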

Evaluation on diagnostic datasets:

These results are not included in the overall score.


Task name      Result   Metric
ruHHH          0.753    Accuracy
  • Honest: 0.721
  • Harmless: 0.81
  • Helpful: 0.729
ruHateSpeech   0.774    Accuracy
  • Women: 0.759
  • Men: 0.8
  • LGBT: 0.706
  • Nationality: 0.73
  • Migrants: 0.429
  • Other: 0.869
ruDetox        0.05     Overall average score (J)
  • Meaning preservation (SIM): 0.307
  • Naturalness (FL): 0.821
  • Style Transfer Accuracy (STA): 0.147
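
In the original RUSSE Detoxification evaluation, the aggregate J is the per-sample product of STA, SIM and FL averaged over the test set, which is why it differs from the product of the three averages; it is assumed here that MERA follows the same convention. A minimal sketch:

    def j_score(sta, sim, fl):
        # Per-sample product of the three component scores, averaged;
        # sta, sim, fl are equal-length lists of per-sample scores.
        assert len(sta) == len(sim) == len(fl) and len(sta) > 0
        return sum(s * m * f for s, m, f in zip(sta, sim, fl)) / len(sta)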

ruEthics

Metric: 5 MCC (Matthews correlation coefficient per ethical criterion)

                 Correct   Good     Ethical
Virtue           -0.336    -0.294   -0.314
Law              -0.332    -0.3     -0.301
Moral            -0.351    -0.305   -0.323
Justice          -0.31     -0.261   -0.273
Utilitarianism   -0.237    -0.201   -0.242
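Each cell of the ruEthics matrix is a Matthews correlation coefficient between the model's binary answers to one of the three questions (correct / good / ethical) and the annotated labels for one of the five ethical criteria. A minimal sketch, assuming 0/1 encodings and scikit-learn for the MCC itself:

    from sklearn.metrics import matthews_corrcoef

    CRITERIA = ["virtue", "law", "moral", "justice", "utilitarianism"]
    QUESTIONS = ["correct", "good", "ethical"]

    def ethics_matrix(answers, labels):
        # answers[q]: the model's 0/1 answers to question q over the dataset;
        # labels[c]: the annotated 0/1 labels for criterion c.
        return [[matthews_corrcoef(labels[c], answers[q]) for c in CRITERIA]
                for q in QUESTIONS]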

Information about the submission:

Team:

GIGACHAT

Name of the ML model:

GigaChat Lite

Additional links:

https://developers.sber.ru/docs/ru/gigachat/api/overview

Architecture description:

GigaChat Lite (version `GigaChat:4.0.26.8`) is a 7B-parameter Large Language Model (LLM) fine-tuned on an instruction corpus, with a context length of 8192 tokens. This version has been available to users via the API since 13.07.
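
Since the model is served through the GigaChat API (see the documentation linked above), a hypothetical usage sketch with the `gigachat` Python SDK follows; the package interface is recalled from the public docs and the credentials value is a placeholder:

    # Hypothetical usage sketch; verify against the API docs linked above.
    # pip install gigachat
    from gigachat import GigaChat

    with GigaChat(credentials="<your-auth-key>") as giga:  # placeholder key
        response = giga.chat("Explain what the MERA benchmark is.")
        print(response.choices[0].message.content)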

Description of the training:

-

Pretrain data:

-

Training Details:

-

License:

-

Strategy, generation and parameters:

Code version v.1.1.0. None of the parameters were changed; all were used as prepared by the organizers.

Details:
  • 2 x NVIDIA A100 + accelerate
  • dtype: float16
  • PyTorch 2.3.1 + CUDA 12.1
  • Transformers 4.42.3
  • Context length: 8192
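
For reference, a hypothetical sketch of loading a causal LM under this environment (float16, two GPUs sharded by accelerate's device_map); the model path is a placeholder, as the GigaChat Lite weights are not openly published:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_PATH = "<model-path>"  # placeholder; weights are not public

    tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_PATH,
        torch_dtype=torch.float16,  # dtype float16, as listed above
        device_map="auto",          # shards across the 2 x A100 via accelerate
    )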