MTS AI Chat 7B

Created at 11.02.2024 22:10

Overall score: 0.479


Task name       Result                   Metric
BPS             0.276                    Accuracy
LCS             0.094                    Accuracy
RCB             0.532 / 0.53             Avg. F1 / Accuracy
USE             0.128                    Grade Norm
RWSD            0.615                    Accuracy
PARus           0.834                    Accuracy
ruTiE           0.574                    Accuracy
MultiQ          0.361 / 0.278            F1-score / EM
ruMMLU          0.689                    Accuracy
CheGeKa         0.083 / 0.046            F1 / EM
ruModAr         0.717                    Accuracy
SimpleAr        0.955                    Accuracy
ruMultiAr       0.233                    Accuracy
MathLogicQA     0.407                    Accuracy
ruHumanEval     0.018 / 0.088 / 0.177    pass@k
ruWorldTree     0.846 / 0.845            Avg. F1 / Accuracy
ruOpenBookQA    0.763 / 0.762            Avg. F1 / Accuracy
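
The overall figure of 0.479 reported at the top of the page can be reproduced from the table above under one assumption that the page itself does not state: each task with several metrics is first averaged into a single per-task score, and the overall value is the unweighted mean over the 17 tasks. A minimal sketch of that check (the function name `overall_score` is hypothetical):

```python
# Per-task results copied from the table above.
# Tasks that report several metrics are stored as tuples.
results = {
    "BPS": (0.276,),
    "LCS": (0.094,),
    "RCB": (0.532, 0.53),
    "USE": (0.128,),
    "RWSD": (0.615,),
    "PARus": (0.834,),
    "ruTiE": (0.574,),
    "MultiQ": (0.361, 0.278),
    "ruMMLU": (0.689,),
    "CheGeKa": (0.083, 0.046),
    "ruModAr": (0.717,),
    "SimpleAr": (0.955,),
    "ruMultiAr": (0.233,),
    "MathLogicQA": (0.407,),
    "ruHumanEval": (0.018, 0.088, 0.177),
    "ruWorldTree": (0.846, 0.845),
    "ruOpenBookQA": (0.763, 0.762),
}


def overall_score(task_results):
    """Unweighted mean of per-task scores (assumed aggregation rule)."""
    # Collapse multi-metric tasks to one number each, then average tasks.
    per_task = [sum(values) / len(values) for values in task_results.values()]
    return sum(per_task) / len(per_task)


print(round(overall_score(results), 3))  # 0.479
```

The diagnostic datasets listed further down are left out of this computation, consistent with the note that they do not count toward the overall result.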

Evaluation on diagnostic datasets:

These results are not included in the overall score.


Task name       Result    Metric
ruHHH           0.719     Accuracy
  • Honest: 0.672
  • Harmless: 0.828
  • Helpful: 0.661
ruHateSpeech    0.758     Accuracy
  • Women: 0.75
  • Men: 0.771
  • LGBT: 0.765
  • Nationality: 0.757
  • Migrants: 0.571
  • Other: 0.787
ruDetox
  • Overall average score (J): 0.229
  • Meaning preservation (SIM): 0.724
  • Naturalness (FL): 0.584
  • Style transfer accuracy (STA): 0.517
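
The ruDetox joint score J is not simply the product of the three averaged components. Assuming J is computed per sample as STA · SIM · FL and then averaged (the usual setup for detoxification metrics, but not stated on this page), the product of the averages only gives a rough comparison value. A short sketch of that sanity check:

```python
# Averaged ruDetox components as reported above.
sta, sim, fl = 0.517, 0.724, 0.584
reported_j = 0.229

# Product of the averaged components. This is NOT how J is defined under
# the per-sample assumption; it is only a rough cross-check.
product_of_means = sta * sim * fl

print(round(product_of_means, 3))               # 0.219
print(round(reported_j - product_of_means, 3))  # 0.01
```

The small gap is what one would expect if the three per-sample scores are positively correlated, since the mean of a product then exceeds the product of the means.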

ruEthics
                  Correct    Good      Ethical
Virtue            -0.276     -0.313    -0.419
Law               -0.28      -0.283    -0.381
Moral             -0.279     -0.319    -0.417
Justice           -0.247     -0.295    -0.378
Utilitarianism    -0.223     -0.267    -0.338

Table results:

[[-0.276, -0.28, -0.279, -0.247, -0.223],
 [-0.313, -0.283, -0.319, -0.295, -0.267],
 [-0.419, -0.381, -0.417, -0.378, -0.338]]

Metric: 5 MCC
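
For readers unfamiliar with the metric: MCC (Matthews correlation coefficient) ranges from -1 to 1, where 0 means no correlation, so the negative ruEthics values indicate that the model's answers are anti-correlated with the reference labels. The sketch below assumes binary answers and binary labels; how the benchmark pairs model answers with each of the five ethical criteria is not described on this page, and the sample data is hypothetical.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary (0/1) labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical data: the predictions disagree with the labels more often
# than they agree, which yields a negative coefficient.
labels = [1, 1, 0, 0, 1, 0]
preds = [0, 0, 1, 1, 1, 0]
print(mcc(labels, preds))
```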

Information about the submission:

Team:

MTS AI

Name of the ML model:

MTS AI Chat 7B

Architecture description:

Mistral 7B model architecture

Description of the training:

Mistral trained on proprietary DPO and SFT datasets

Pretrain data:

-

Training Details:

-

License:

Proprietary model developed by MTS AI

Strategy, generation and parameters:

Code version v.1.1.0. No parameters were changed. Inference details: torch 2.1.0 + CUDA 11.8, max length 6012 tokens.

Comments about inference:

We ran the model using the MERA GitHub repository without any changes, via the HF inference script.