Task | Result | Metric
---|---|---
BPS | 0.224 | Accuracy |
LCS | 0.12 | Accuracy |
RCB | 0.562 / 0.484 | Avg. F1 / Accuracy |
USE | 0.169 | Grade Norm |
RWSD | 0.546 | Accuracy |
PARus | 0.896 | Accuracy |
ruTiE | 0.779 | Accuracy |
MultiQ | 0.192 / 0.097 | F1-score/EM |
ruMMLU | 0.811 | Accuracy |
CheGeKa | 0.451 / 0.363 | F1 / EM |
ruModAr | 0.589 | Accuracy |
SimpleAr | 0.96 | Accuracy |
ruMultiAr | 0.226 | Accuracy |
MathLogicQA | 0.395 | Accuracy |
ruHumanEval | 0.021 / 0.107 / 0.213 | pass@k |
ruWorldTree | 0.96 / 0.96 | Avg. F1 / Accuracy |
ruOpenBookQA | 0.875 / 0.874 | Avg. F1 / Accuracy |
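The three ruHumanEval numbers are pass@k scores at increasing k (the exact k values are not stated here). pass@k is commonly computed with the unbiased estimator from the original HumanEval work; a minimal sketch (function name is illustrative):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples
    drawn from n generations (c of them correct) solves the task."""
    if n - c < k:  # fewer than k incorrect samples -> a correct one is guaranteed
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

The per-task estimates are then averaged over the benchmark to produce the reported scores.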
Task | Result | Metric
---|---|---
ruHHH | 0.792 | Accuracy
ruHateSpeech | 0.758 | Accuracy
ruDetox | | Overall average score (J) / Meaning preservation score (SIM) / Fluency score (FL) / Style transfer accuracy (STA)
ruEthics | [[-0.516, -0.548, -0.492, -0.498, -0.422], | 5 MCC
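The ruEthics results are Matthews correlation coefficients (MCC) between the model's answers and each of five ethical criteria, so negative values indicate anti-correlation. A minimal sketch of binary MCC in pure Python (function name is illustrative; library implementations such as scikit-learn's `matthews_corrcoef` would normally be used):

```python
from math import sqrt

def mcc(y_true: list[int], y_pred: list[int]) -> float:
    """Matthews correlation coefficient for binary labels (0/1).
    Returns a value in [-1, 1]; 0 when the denominator degenerates."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```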
SberDevices
GigaChat Pro
GigaChat Pro is a Large Language Model (LLM) with 29B parameters that was fine-tuned on an instruction corpus and has a context length of 8192 tokens.
Proprietary model from Sber.
Code version v.1.1.0. All parameters were left unchanged and used as prepared by the organizers. Details:
- 2 x NVIDIA A100 + accelerate
- dtype float16
- PyTorch 2.0.1 + CUDA 11.7
- Transformers 4.36.2
- Context length 8192