Task | Result | Metric
---|---|---
BPS | 0.416 | Accuracy |
LCS | 0.088 | Accuracy |
RCB | 0.491 / 0.398 | Avg. F1 / Accuracy |
USE | 0.109 | Grade Norm |
RWSD | 0.527 | Accuracy |
PARus | 0.844 | Accuracy |
ruTiE | 0.756 | Accuracy |
MultiQ | 0.21 / 0.109 | F1 / EM
ruMMLU | 0.769 | Accuracy |
CheGeKa | 0.308 / 0.255 | F1 / EM |
ruModAr | 0.481 | Accuracy |
SimpleAr | 0.913 | Accuracy |
ruMultiAr | 0.184 | Accuracy |
MathLogicQA | 0.369 | Accuracy |
ruHumanEval | 0.009 / 0.046 / 0.091 | pass@k |
ruWorldTree | 0.931 / 0.932 | Avg. F1 / Accuracy |
ruOpenBookQA | 0.818 / 0.818 | Avg. F1 / Accuracy |
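The three ruHumanEval numbers are pass@k scores for increasing k. pass@k is commonly computed with the unbiased estimator from the HumanEval paper; a minimal Python sketch, assuming n generated samples per task, c of which pass the unit tests:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations (c correct), passes."""
    if n - c < k:
        return 1.0  # fewer than k incorrect samples exist, so success is certain
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 generations per task, 2 of them correct
print(round(pass_at_k(10, 2, 1), 3))  # → 0.2
```

The estimator averages this per-task probability over all tasks; averaging raw c/n ratios instead would bias the score when n > k.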
Task | Result | Metric
---|---|---
ruHHH | 0.697 | Accuracy
ruHateSpeech | 0.766 | Accuracy
ruDetox | | Overall average score (J) / Meaning preservation (SIM) / Naturalness (FL) / Style transfer accuracy (STA)
ruEthics | [[-0.273, -0.303, -0.258, -0.274, -0.213], | 5 MCC
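ruEthics is scored with the Matthews correlation coefficient (MCC), reported as five values per row (one per ethical criterion). For the binary case, MCC can be sketched directly from confusion-matrix counts:

```python
from math import sqrt

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for binary labels.
    Ranges from -1 (total disagreement) through 0 (chance) to +1 (perfect)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # degenerate confusion matrix: treat as no correlation
    return (tp * tn - fp * fn) / denom
```

Near-zero negative values like those in the table indicate predictions roughly uncorrelated with the ethical labels.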
SberDevices
GigaChat Lite+
GigaChat is a Large Language Model (LLM) with 7B parameters that was fine-tuned on an instruction corpus and has a context length of 32768 tokens.
Proprietary model from Sber
Code version v.1.1.0. All parameters were left unchanged and used as prepared by the organizers. Details:
- 1 x NVIDIA A100 + accelerate
- dtype float16
- PyTorch 2.0.1 + CUDA 11.7
- Transformers 4.36.2
- Context length 14532
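The evaluation context length (14532) is smaller than the model's full 32768-token window. Evaluation harnesses typically handle over-long prompts by left-truncation, keeping the most recent tokens; a minimal sketch of that behavior (the exact truncation policy of the organizers' code is an assumption here):

```python
def truncate_context(token_ids: list[int], max_len: int = 14532) -> list[int]:
    """Keep only the most recent max_len tokens (left truncation),
    so the end of the prompt — usually the question — survives."""
    if len(token_ids) <= max_len:
        return token_ids
    return token_ids[-max_len:]
```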