Task name | Result | Metric |
---|---|---|
LCS | 0.112 | Accuracy |
RCB | 0.333 / 0.167 | Avg. F1 / Accuracy |
USE | 0.023 | Grade Norm |
RWSD | 0.496 | Accuracy |
PARus | 0.514 | Accuracy |
ruTiE | 0.505 | Accuracy |
MultiQ | 0.079 / 0.051 | F1-score / EM
CheGeKa | 0.008 / 0 | F1 / EM |
ruModAr | 0.416 | EM |
ruMultiAr | 0.189 | EM |
MathLogicQA | 0.382 | Accuracy |
ruWorldTree | 0.541 / 0.542 | Avg. F1 / Accuracy |
ruOpenBookQA | 0.59 / 0.588 | Avg. F1 / Accuracy |
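Several tasks above (MultiQ, CheGeKa) report a token-level F1 score alongside exact match. A minimal sketch of these two metrics, assuming SQuAD-style whitespace tokenization and lowercasing; MERA's own normalization rules may differ:

```python
from collections import Counter

def exact_match(pred: str, gold: str) -> float:
    """1.0 if the normalized prediction equals the gold answer, else 0.0."""
    return float(pred.strip().lower() == gold.strip().lower())

def token_f1(pred: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall
    over the multiset of shared tokens."""
    pred_toks = pred.lower().split()
    gold_toks = gold.lower().split()
    overlap = sum((Counter(pred_toks) & Counter(gold_toks)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```

With this definition, a prediction sharing two of three tokens with the gold answer scores F1 = 2/3 even though exact match is 0, which is why the two numbers in the table can diverge.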
Task name | Result | Metric |
---|---|---|
BPS | 0.469 | Accuracy |
ruMMLU | 0.487 | Accuracy |
SimpleAr | 0.951 | EM |
ruHumanEval | 0.003 / 0.015 / 0.03 | pass@k |
ruHHH | 0.483 | Accuracy |
ruHateSpeech | 0.562 | Accuracy |
ruDetox | | Overall average score (J) / Meaning preservation (SIM) / Naturalness (FL) / Style Transfer Accuracy (STA) |
ruEthics | [[0, 0, 0, 0, 0], … | 5 MCC |
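ruHumanEval reports three pass@k values (the table does not state which k they correspond to; k = 1, 5, 10 is a common choice). A sketch of the standard unbiased pass@k estimator, computed from n generated samples of which c pass the tests:

```python
def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations (c correct), passes.
    Equivalent to 1 - C(n - c, k) / C(n, k), in product form for stability."""
    if n - c < k:
        return 1.0  # too few failures to fill k draws without a success
    result = 1.0
    for i in range(n - c + 1, n + 1):
        result *= 1.0 - k / i
    return 1.0 - result
```

For example, with n = 2 samples of which c = 1 is correct, pass@1 is 0.5, matching the intuition that a single random draw succeeds half the time.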
MERA

Yi-6B

The Yi series models follow the same architecture as LLaMA and support a context window of up to 200k tokens. Trained on a 3T-token multilingual corpus. Apache 2.0 license.
Code version v.1.1.0. All parameters were left unchanged, as prepared by the organizers. Details:
- 1 × NVIDIA A100
- dtype: auto
- PyTorch 2.1.2 + CUDA 12.1
- Transformers 4.36.2
- Context length: 4096
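A minimal sketch of loading the model under this setup with Transformers; the checkpoint id `01-ai/Yi-6B` is an assumption (the listing above does not name one), and the MERA harness itself drives the actual evaluation. This is a configuration fragment, not the organizers' exact code:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face checkpoint id for the evaluated model.
model_id = "01-ai/Yi-6B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.model_max_length = 4096  # context length used for the run

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # "dtype auto" from the setup above
    device_map="auto",    # place the model on the available GPU (1 x A100)
)
```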