Task | Result | Metric
---|---|---
LCS | 0.09 | Accuracy
RCB | 0.53 / 0.449 | Avg. F1 / Accuracy
USE | 0.338 | Grade Norm
RWSD | 0.585 | Accuracy
PARus | 0.884 | Accuracy
ruTiE | 0.791 | Accuracy
MultiQ | 0.369 / 0.247 | F1-score / EM
CheGeKa | 0.104 / 0 | F1 / EM
ruModAr | 0.866 | EM
ruMultiAr | 0.273 | EM
MathLogicQA | 0.467 | Accuracy
ruWorldTree | 0.939 / 0.939 | Avg. F1 / Accuracy
ruOpenBookQA | 0.873 / 0.872 | Avg. F1 / Accuracy
Task | Result | Metric
---|---|---
BPS | 0.318 | Accuracy
ruMMLU | 0.816 | Accuracy
SimpleAr | 0.971 | EM
ruHumanEval | 0.013 / 0.064 / 0.128 | pass@k
ruHHH | 0.764 | Accuracy
ruHateSpeech | 0.751 | Accuracy
ruDetox | - | Overall average score (J) / Meaning preservation (SIM) / Fluency (FL) / Style transfer accuracy (STA)
ruEthics | [-0.493, -0.493, -0.492, -0.447, -0.422] | 5 MCC
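The three ruHumanEval numbers above are pass@k scores for increasing k. They are conventionally computed with the unbiased estimator from the HumanEval methodology; a minimal sketch (function name and sample counts are ours, not from the submission):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k completions,
    drawn without replacement from n generations (c of them correct),
    passes the tests. Equals 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # fewer failures than k draws: a correct sample is guaranteed
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 10 generations for a task, 2 of them correct
print(pass_at_k(10, 2, 1))  # 0.2
```

Averaging this quantity over all tasks gives the reported score for each k.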
GigaChat-Pro
GigaChat Pro (version 1.0.26.8) is a Large Language Model (LLM) with 30B parameters, fine-tuned on an instruction corpus, with a context length of 8192 tokens. This version has been available to users via the API since 13.07.
Proprietary model by Sber
Code version v1.1.0. All parameters were left unchanged, as prepared by the organizers. Details:
- 2 x NVIDIA A100 + accelerate
- dtype: float16
- PyTorch 2.3.1 + CUDA 12.1
- Transformers 4.42.3
- Context length: 8192
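A back-of-envelope check of why two A100s suffice for the 30B float16 weights (this helper is ours; it counts weight storage only, ignoring activations and the KV cache):

```python
def fp16_weights_per_gpu_gb(n_params: float, n_gpus: int) -> float:
    """GB of fp16 weight storage per GPU, assuming an even shard
    across GPUs at 2 bytes per parameter."""
    return n_params * 2 / n_gpus / 1e9

# 30B parameters in float16 sharded across 2 x A100
print(fp16_weights_per_gpu_gb(30e9, 2))  # 30.0
```

At roughly 30 GB of weights per card, the 8192-token context's KV cache and activations still have headroom on A100-class memory.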