Mixtral 8x7B Instruct

Created at 03.02.2024 13:28

Overall rating: 0.478


Task name Result Metric
BPS 0.157 Accuracy
LCS 0.082 Accuracy
RCB 0.521 / 0.48 Avg. F1 / Accuracy
USE 0.069 Grade Norm
RWSD 0.635 Accuracy
PARus 0.858 Accuracy
ruTiE 0.695 Accuracy
MultiQ 0.151 / 0.071 F1-score / EM
ruMMLU 0.776 Accuracy
CheGeKa 0.071 / 0 F1 / EM
ruModAr 0.674 Accuracy
SimpleAr 0.977 Accuracy
ruMultiAr 0.288 Accuracy
MathLogicQA 0.408 Accuracy
ruHumanEval 0.024 / 0.122 / 0.244 pass@k (see the note below the table)
ruWorldTree 0.907 / 0.907 Avg. F1 / Accuracy
ruOpenBookQA 0.825 / 0.825 Avg. F1 / Accuracy
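For reference, the three ruHumanEval numbers are pass@k values for increasing k. pass@k is commonly estimated with the unbiased estimator of Chen et al. (2021) from n generated samples per task, of which c pass the tests; whether MERA uses exactly this estimator, and which k values the three numbers correspond to (e.g. k = 1, 5, 10), is an assumption here.

```latex
% Unbiased pass@k estimator (Chen et al., 2021):
% n generated samples per task, c of them passing the unit tests.
\text{pass@}k \;=\; \mathbb{E}_{\text{tasks}}\left[\, 1 - \frac{\binom{n-c}{k}}{\binom{n}{k}} \,\right]
```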

Evaluation on diagnostic datasets:

These results are not taken into account in the overall rating.


Task name Result Metric
ruHHH 0.747 Accuracy
  • Honest: 0.656
  • Harmless: 0.862
  • Helpful: 0.729
ruHateSpeech 0.785 Accuracy
  • Women: 0.787
  • Men: 0.771
  • LGBT: 0.588
  • Nationality: 0.811
  • Migrants: 0.571
  • Other: 0.852
ruDetox 0.068 / 0.403 / 0.733 / 0.193 J / SIM / FL / STA (see the note on J below)
  • Overall average score (J): 0.068
  • Meaning preservation (SIM): 0.403
  • Naturalness (FL): 0.733
  • Style transfer accuracy (STA): 0.193
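If ruDetox follows the original detoxification shared-task protocol, the joint score J is the per-sample product of the three component scores averaged over the test set, which is why J (0.068) can sit well below every individual component; the exact aggregation used by MERA is an assumption here.

```latex
% Assumed aggregation: per-sample product of style transfer accuracy (STA),
% meaning preservation (SIM) and naturalness/fluency (FL), averaged over N samples.
J \;=\; \frac{1}{N}\sum_{i=1}^{N} \mathrm{STA}(x_i)\cdot\mathrm{SIM}(x_i)\cdot\mathrm{FL}(x_i)
```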

ruEthics
Criterion Correct Good Ethical
Virtue -0.352 -0.459 -0.472
Law -0.409 -0.45 -0.484
Moral -0.387 -0.49 -0.496
Justice -0.349 -0.397 -0.439
Utilitarianism -0.312 -0.362 -0.39

Metric: 5 MCC (Matthews correlation coefficient, one per ethical criterion)
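MCC here is the Matthews correlation coefficient, computed for each of the five ethical criteria against the model's binary answers to the three questions (hence the 3 x 5 table above); for binary labels it reduces to the standard confusion-matrix form:

```latex
% Matthews correlation coefficient for binary predictions,
% with TP/TN/FP/FN the entries of the confusion matrix.
\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP+FP)\,(TP+FN)\,(TN+FP)\,(TN+FN)}}
```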

Information about the submission:

Team:

Russian_NLP

Name of the ML model:

Mixtral 8x7B Instruct

Additional links:

https://mistral.ai/news/mixtral-of-experts/
https://huggingface.co/mistralai/Mixtral-8x7B-v0.1

Architecture description:

Mixtral 8x7B Instruct is the instruction-tuned version of Mixtral 8x7B. The model has been optimized through supervised fine-tuning and direct preference optimization (DPO) for careful instruction following.
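As a minimal sketch of how the instruction-tuned checkpoint is prompted (not part of this submission), the chat template bundled with the public Hugging Face release wraps user turns in Mixtral's [INST] ... [/INST] format; the model id below is an assumption based on that public release.

```python
# Minimal sketch: rendering a prompt with the Mixtral instruction template.
# The model id is an assumption based on the public Hugging Face release,
# not something stated in this submission.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")

messages = [{"role": "user", "content": "Translate 'mixture of experts' into Russian."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # e.g. "<s>[INST] Translate 'mixture of experts' into Russian. [/INST]"
```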

Description of the training:

Mixtral is pre-trained on data extracted from the open Web; the experts and the routers are trained simultaneously. The model was then optimized through supervised fine-tuning and direct preference optimization (DPO) for careful instruction following.
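For reference, the standard DPO objective (Rafailov et al., 2023) trains the policy directly on preference pairs against a frozen reference model; whether Mistral's internal recipe matches this exact formulation is not stated in the submission.

```latex
% DPO loss on a preference pair (x, y_w, y_l), with y_w preferred over y_l;
% pi_theta is the policy being trained, pi_ref a frozen reference model,
% beta a temperature hyperparameter, sigma the logistic function.
\mathcal{L}_{\mathrm{DPO}}(\theta) =
  -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\!\left[
    \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right)\right]
```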

Pretrain data:

The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture-of-Experts model. Mixtral-8x7B outperforms Llama 2 70B on most benchmarks.
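To unpack "Sparse Mixture of Experts": in each Mixtral feed-forward block a router selects 2 of 8 experts per token and mixes their outputs, so only a fraction of the parameters is active for any given token. The sketch below is a simplified illustration of that top-2 routing pattern, not Mistral's actual implementation; all dimensions and module names are made up for the example.

```python
# Simplified sketch of top-2 sparse mixture-of-experts routing
# (illustration only, not the actual Mixtral implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        # Each expert is an independent feed-forward network.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))
             for _ in range(num_experts)]
        )
        self.router = nn.Linear(dim, num_experts)  # one logit per expert
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim). Pick the top-k experts per token and mix their outputs.
        logits = self.router(x)
        weights, idx = torch.topk(logits, self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, k] == e          # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * self.experts[e](x[mask])
        return out

tokens = torch.randn(4, 64)
print(Top2MoE(dim=64)(tokens).shape)  # torch.Size([4, 64])
```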

Training Details:

-

License:

Apache 2.0.

Strategy, generation and parameters:

Code version v.1.1.0. All parameters were left unchanged and used as prepared by the organizers.

Details:
  • 2 x NVIDIA A100 + accelerate
  • dtype: bfloat16
  • PyTorch 2.0.1 + CUDA 11.7
  • Transformers 4.36.2
  • Context length: 10624
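A minimal sketch of a comparable inference setup (bfloat16 weights sharded across the two A100s via accelerate's device_map); the model id is an assumption based on the public release, and the actual generation parameters are those prepared by the MERA organizers and are not reproduced here.

```python
# Minimal sketch of a comparable inference setup (not the organizers' harness):
# bfloat16 weights sharded across available GPUs via accelerate's device_map.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # assumed checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # dtype bfloat16, as reported above
    device_map="auto",           # requires accelerate; spreads layers over the 2 x A100
)

inputs = tokenizer("[INST] 2 + 2 = ? [/INST]", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```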