Task name | Result | Metric |
---|---|---|
LCS | 0.094 | Accuracy |
RCB | 0.532 / 0.53 | Avg. F1 / Accuracy |
USE | 0.128 | Grade Norm |
RWSD | 0.615 | Accuracy |
PARus | 0.834 | Accuracy |
ruTiE | 0.574 | Accuracy |
MultiQ | 0.361 / 0.278 | F1-score / EM
CheGeKa | 0.083 / 0.046 | F1 / EM |
ruModAr | 0.717 | EM |
ruMultiAr | 0.233 | EM |
MathLogicQA | 0.407 | Accuracy |
ruWorldTree | 0.846 / 0.845 | Avg. F1 / Accuracy |
ruOpenBookQA | 0.763 / 0.762 | Avg. F1 / Accuracy |
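Several tasks above report F1 / EM scores. As a rough sketch of how these per-example scores are conventionally computed for QA tasks (standard SQuAD-style definitions with simple whitespace tokenization; MERA's exact normalization rules may differ):

```python
from collections import Counter

def exact_match(pred: str, gold: str) -> int:
    # EM: 1 if the normalized strings are identical, else 0
    return int(pred.strip().lower() == gold.strip().lower())

def token_f1(pred: str, gold: str) -> float:
    # Token-level F1: harmonic mean of precision and recall
    # over the multiset of whitespace-separated tokens
    p, g = pred.lower().split(), gold.lower().split()
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p)
    recall = overlap / len(g)
    return 2 * precision * recall / (precision + recall)
```

Corpus-level scores are then averages of these per-example values over the task's test set.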
Task name | Result | Metric |
---|---|---|
BPS | 0.276 | Accuracy
ruMMLU | 0.689 | Accuracy
SimpleAr | 0.955 | EM
ruHumanEval | 0.018 / 0.088 / 0.177 | pass@k
ruHHH | 0.719 | Accuracy
ruHateSpeech | 0.758 | Accuracy
ruDetox | | Overall average score (J) / Preservation of meaning (SIM) / Naturalness (FL) / Style Transfer Accuracy (STA)
ruEthics | [[-0.276, -0.28, -0.279, -0.247, -0.223], | 5 MCC
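ruHumanEval reports three pass@k values. A minimal sketch of the standard unbiased pass@k estimator used by code-generation benchmarks (from the HumanEval paper), assuming `n` samples are drawn per task, of which `c` pass the unit tests:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of the probability that at least one of k
    samples (drawn without replacement from n generations, c of which
    are correct) passes: 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        # Fewer than k incorrect samples: some draw must be correct
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

The benchmark-level pass@k is the mean of this estimate over all tasks; the exact values of n and k used in the MERA run are not stated here.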
MTS AI
MTS AI Chat 7B
Mistral 7B model architecture
Mistral 7B fine-tuned on proprietary SFT and DPO datasets
-
-
Proprietary model developed by MTS AI
Code version v.1.1.0. All parameters were left unchanged. Inference details: torch 2.1.0 + CUDA 11.8, max length 6012 tokens.
We ran the model using the MERA GitHub repo without any changes, via the HF inference script.