Task name | Result | Metric |
---|---|---|
LCS | 0.108 | Accuracy |
RCB | 0.498 / 0.402 | Avg. F1 / Accuracy |
USE | 0.049 | Grade Norm |
RWSD | 0.562 | Accuracy |
PARus | 0.74 | Accuracy |
ruTiE | 0.602 | Accuracy |
MultiQ | 0.185 / 0.107 | F1-score / EM
CheGeKa | 0.01 / 0 | F1 / EM |
ruModAr | 0.635 | EM |
ruMultiAr | 0.277 | EM |
MathLogicQA | 0.473 | Accuracy |
ruWorldTree | 0.838 / 0.838 | Avg. F1 / Accuracy |
ruOpenBookQA | 0.748 / 0.746 | Avg. F1 / Accuracy |
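Several rows above report F1-score and exact match (EM) for generative tasks. As a rough sketch of how these metrics are typically computed (this is an illustration, not necessarily the benchmark's exact scoring code), token-level F1 and EM over an answer string look like:

```python
from collections import Counter

def exact_match(pred: str, gold: str) -> int:
    """1 if the normalized prediction equals the reference, else 0."""
    return int(pred.strip().lower() == gold.strip().lower())

def token_f1(pred: str, gold: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over tokens."""
    pred_toks = pred.lower().split()
    gold_toks = gold.lower().split()
    # Count of tokens shared between prediction and reference (with multiplicity)
    common = sum((Counter(pred_toks) & Counter(gold_toks)).values())
    if common == 0:
        return 0.0
    precision = common / len(pred_toks)
    recall = common / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```

For example, `token_f1("the red cat", "red cat")` shares two tokens, giving precision 2/3, recall 1, and F1 = 0.8, while `exact_match` would score it 0.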
Task name | Result | Metric |
---|---|---|
BPS | 0.426 | Accuracy
ruMMLU | 0.676 | Accuracy
SimpleAr | 0.981 | EM
ruHumanEval | 0.004 / 0.021 / 0.043 | pass@k
ruHHH | 0.601 | Accuracy
ruHateSpeech | 0.626 | Accuracy
ruDetox | | Overall average score (J) / Meaning preservation (SIM) / Naturalness (FL) / Style transfer accuracy (STA)
ruEthics | [[-0.12, -0.113, -0.108, -0.125, -0.082], …] | 5 MCC
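The ruHumanEval row reports pass@k, presumably for k ∈ {1, 5, 10}. The standard unbiased pass@k estimator (generate n samples per problem, of which c pass the tests) can be sketched as follows; this is an illustrative implementation, not necessarily the benchmark's exact scoring code:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        # Fewer than k incorrect samples: every size-k draw contains a correct one
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For instance, with n = 4 samples of which c = 1 passes, `pass_at_k(4, 1, 2)` gives 0.5: half of all 2-sample draws include the passing solution.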
LM Research
Yi 34B 200K
Yi 34B 200K follows the same model architecture as LLaMA, with a 200K context window. It was trained on a 3T-token multilingual corpus. Yi has independently created its own high-quality training datasets, efficient training pipelines, and robust training infrastructure entirely from the ground up.
Apache 2.0 license
Code version v.1.1.0. All parameters were left unchanged and are used as prepared by the organizers.

Details:
- 2 × NVIDIA A100
- dtype: float16
- PyTorch 2.1.2 + CUDA 12.1
- Transformers 4.36.2
- Context length: 11000