Task name | Result | Metric |
---|---|---|
LCS | 0.116 | Accuracy |
RCB | 0.333 / 0.167 | Avg. F1 / Accuracy |
USE | 0.004 | Grade Norm |
RWSD | 0.5 | Accuracy |
PARus | 0.498 | Accuracy |
ruTiE | 0.5 | Accuracy |
MultiQ | 0.104 / 0.023 | F1-score/EM |
CheGeKa | 0.007 / 0 | F1 / EM |
ruModAr | 0.001 | EM |
ruMultiAr | 0.01 | EM |
MathLogicQA | 0.222 | Accuracy |
ruWorldTree | 0.265 / 0.222 | Avg. F1 / Accuracy |
ruOpenBookQA | 0.248 / 0.201 | Avg. F1 / Accuracy |
BPS | 0.461 | Accuracy |
ruMMLU | 0.275 | Accuracy |
SimpleAr | 0.006 | EM |
ruHumanEval | 0 / 0 / 0 | pass@k |
ruHHH | 0.478 | Accuracy |
ruHateSpeech | 0.543 | Accuracy |
ruDetox | | Overall average score (J) / Assessment of the preservation of meaning (SIM) / Assessment of naturalness (FL) / Style Transfer Accuracy (STA) |
ruEthics | Table results: [[0, 0, 0, 0, 0], | 5 MCC |
0x7o
Aeonium v1.1 Base 4B
Llama; 4k context; 4B parameters
Pre-training only
Web pages, news, literature, Wikipedia
One epoch on TPU v4-256. lr = 0.0003
Apache 2.0
- lm-harness v1.1.0
- 1xRTX 4090
- transformers v4.41