Task name | Result | Metric |
---|---|---|
LCS | 0.096 | Accuracy |
RCB | 0.361 / 0.36 | Avg. F1 / Accuracy |
USE | 0.064 | Grade Norm |
RWSD | 0.519 | Accuracy |
PARus | 0.482 | Accuracy |
ruTiE | 0.472 | Accuracy |
MultiQ | 0.014 / 0.001 | F1-score / EM
CheGeKa | 0.002 / 0 | F1 / EM |
ruModAr | 0.0 | EM |
ruMultiAr | 0.0 | EM |
MathLogicQA | 0.244 | Accuracy |
ruWorldTree | 0.23 / 0.229 | Avg. F1 / Accuracy |
ruOpenBookQA | 0.245 / 0.245 | Avg. F1 / Accuracy |
Task name | Result | Metric |
---|---|---|
BPS | 0.5 | Accuracy
ruMMLU | 0.258 | Accuracy
SimpleAr | 0.0 | EM
ruHumanEval | 0 / 0 / 0 | pass@k
ruHHH | 0.522 | Accuracy
ruHateSpeech | 0.468 | Accuracy
ruDetox | | Overall average score (J) / Assessment of the preservation of meaning (SIM) / Assessment of naturalness (FL) / Style Transfer Accuracy (STA)
ruEthics | Table results: [[0.013, 0.014, -0.01, -0.038, 0.014], … | 5 MCC
MERA
Random submission
Random submission: a basic low baseline to compare against. For each task, one of the answer options is chosen uniformly at random and scored.
No data.
No training.
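The random baseline described above can be sketched in a few lines. This is an illustrative sketch only, not the MERA evaluation code: the task format (a list of answer options plus a gold answer) and function names are assumptions.

```python
import random

def random_baseline_answer(options, rng):
    """Pick one answer option uniformly at random (hypothetical helper)."""
    return rng.choice(options)

def accuracy(examples, rng):
    """Score random guesses against gold answers.

    `examples` is a list of (options, gold_answer) pairs -- an assumed
    format for a multiple-choice task, not the actual MERA schema.
    """
    correct = sum(
        random_baseline_answer(options, rng) == gold
        for options, gold in examples
    )
    return correct / len(examples)

# With k answer options, expected accuracy is about 1/k: roughly 0.5 for
# binary tasks such as BPS or PARus, and roughly 0.25 for four-option
# tasks, which matches the low scores in the tables above.
examples = [(["A", "B", "C", "D"], "A")] * 10_000
print(accuracy(examples, random.Random(0)))
```

This makes explicit why the baseline scores cluster near the reciprocal of the number of answer options, and why generative tasks scored with EM (e.g. SimpleAr, ruModAr) come out at 0.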