Mixtral 8x7B Instruct

Submitted by Russian_NLP on 03.02.2024 at 13:28
Overall result: 0.478
(The submission does not include all of the required tasks.)

Ratings for leaderboard tasks


Task name Result Metric
LCS 0.082 Accuracy
RCB 0.521 / 0.48 Accuracy / F1 macro
USE 0.069 Grade norm
RWSD 0.635 Accuracy
PARus 0.858 Accuracy
ruTiE 0.695 Accuracy
MultiQ 0.151 / 0.071 F1 / Exact match
CheGeKa 0.071 / 0 F1 / Exact match
ruModAr 0.674 Exact match
ruMultiAr 0.288 Exact match
MathLogicQA 0.408 Accuracy
ruWorldTree 0.907 / 0.907 Accuracy / F1 macro
ruOpenBookQA 0.825 / 0.825 Accuracy / F1 macro
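Several tasks above report paired scores such as Accuracy together with F1 macro. As a minimal sketch of how the two metrics relate (assuming the standard definitions; the exact MERA scoring code may differ):

```python
def accuracy(gold, pred):
    # Fraction of predictions that exactly match the gold label.
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def f1_macro(gold, pred):
    # Unweighted mean of per-class F1 scores, so rare classes
    # weigh as much as frequent ones.
    labels = sorted(set(gold) | set(pred))
    f1s = []
    for c in labels:
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)
```

On class-balanced tasks the two values tend to coincide (ruWorldTree: 0.907 / 0.907), while class imbalance pulls them apart (RCB: 0.521 / 0.48).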

Evaluation on open tasks:

Per-subcategory scores are listed in the "Ratings by subcategory" section below.


Task name Result Metric
BPS 0.157 Accuracy
ruMMLU 0.776 Accuracy
SimpleAr 0.977 Exact match
ruHumanEval 0.024 / 0.122 / 0.244 Pass@k
ruHHH 0.747
ruHateSpeech 0.785
ruDetox 0.068
ruEthics
Correct Good Ethical
Virtue -0.352 -0.459 -0.472
Law -0.409 -0.45 -0.484
Moral -0.387 -0.49 -0.496
Justice -0.349 -0.397 -0.439
Utilitarianism -0.312 -0.362 -0.39
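ruHumanEval above reports three Pass@k values (presumably for three values of k; the submission does not say which). Assuming the standard unbiased pass@k estimator used by code-generation benchmarks, a sketch:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations of which c pass the
    tests, is correct. Averaged over all problems to get the score."""
    if n - c < k:
        # Too few failures to fill a k-sample with only wrong answers.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with n = 10 generations of which c = 2 pass, pass@1 is 0.2.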

Information about the submission:

MERA version
-
Torch version
-
Codebase version
-
CUDA version
-
Precision of the model weights
-
Seed
-
Batch size
-
Transformers version
-
Number and type of GPUs
-
Architecture
-

Team:

Russian_NLP

Name of the ML model:

Mixtral 8x7B Instruct

Additional links:

https://mistral.ai/news/mixtral-of-experts/
https://huggingface.co/mistralai/Mixtral-8x7B-v0.1

Architecture description:

Mixtral 8x7B Instruct is the instruction-tuned version of Mixtral 8x7B. The model has been optimized through supervised fine-tuning and direct preference optimization (DPO) for careful instruction following.

Description of the training:

Mixtral is pre-trained on data extracted from the open Web; the experts and the routers are trained simultaneously. The instruct model was then optimized through supervised fine-tuning and direct preference optimization (DPO) for careful instruction following.
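The sparse mixture-of-experts layer behind the "experts and routers" wording can be pictured as a router that scores all 8 experts per token, keeps the top 2, renormalizes their gate weights, and mixes those experts' outputs. A toy illustration in plain Python (the actual Mixtral implementation differs in many details, e.g. it operates on vectors and batches):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_layer(token, router, experts, top_k=2):
    # The router scores every expert for this token; only the top_k
    # experts actually run, and their outputs are mixed by their
    # renormalized gate weights.
    gates = softmax(router(token))
    top = sorted(range(len(experts)), key=lambda i: gates[i], reverse=True)[:top_k]
    norm = sum(gates[i] for i in top)
    return sum(gates[i] / norm * experts[i](token) for i in top)
```

With 8 experts and top_k = 2, only a fraction of the total parameters is active per token, which is how Mixtral keeps inference cost far below that of a dense model of the same total size.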

Pretrain data:

The Mixtral 8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts model. Mixtral 8x7B outperforms Llama 2 70B on most benchmarks.

Training Details:

-

License:

Apache 2.0.

Strategy, generation and parameters:

Code version v.1.1.0. All parameters were left unchanged and used as prepared by the organizers. Details:
- 2 x NVIDIA A100 + accelerate
- dtype: bfloat16
- PyTorch 2.0.1 + CUDA 11.7
- Transformers 4.36.2
- Context length: 10624
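The run stored weights in bfloat16, which keeps float32's 8-bit exponent (and hence its dynamic range) but truncates the mantissa from 23 bits to 7. A sketch of that truncation with only the standard library, assuming round-toward-zero for simplicity (real hardware typically rounds to nearest):

```python
import struct

def to_bfloat16(x):
    # Reinterpret the float32 bit pattern and zero the low 16 mantissa
    # bits; storing a value as bfloat16 effectively keeps only the
    # high 16 bits of the float32 representation.
    bits, = struct.unpack("<I", struct.pack("<f", x))
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]
```

Values like 1.0 or -2.5 survive exactly, while pi collapses to 3.140625, which illustrates why bfloat16 is usually safe for weights but coarse for accumulations.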


Ratings by subcategory

All scores below are for Mixtral 8x7B Instruct, team Russian_NLP.

ruHHH (Accuracy)
Honest 0.656 | Helpful 0.729 | Harmless 0.862

ruMMLU (Accuracy), by subject
Anatomy 0.8
Virology 0.75
Astronomy 0.9
Marketing 0.714
Nutrition 0.905
Sociology 1
Management 0.8
Philosophy 0.765
Prehistory 0.8
Human aging 0.9
Econometrics 0.818
Formal logic 0.7
Global facts 0.5
Jurisprudence 0.577
Miscellaneous 0.455
Moral disputes 0.7
Business ethics 0.7
Biology (college) 0.667
Physics (college) 0.8
Human sexuality 1
Moral scenarios 0.4
World religions 0.904
Abstract algebra 1
Medicine (college) 0.804
Machine learning 0.7
Medical genetics 0.909
Professional law 0.75
PR 0.857
Security studies 1
Chemistry (school) 0.909
Computer security 0.5
International law 0.889
Logical fallacies 0.6
Politics 0.8
Clinical knowledge 0.727
Conceptual physics 0.9
Math (college) 0.7
Biology (high school) 0.81
Physics (high school) 0.6
Chemistry (high school) 0.9
Geography (high school) 0.861
Professional medicine 0.9
Electrical engineering 0.8
Elementary mathematics 0.8
Psychology (high school) 0.938
Statistics (high school) 0.6
History (high school) 1
Math (high school) 0.8
Professional accounting 0.5
Professional psychology 0.9
Computer science (college) 0.682
World history (high school) 0.875
Macroeconomics 0.882
Microeconomics 0.733
Computer science (high school) 0.5
European history 0.667
Government and politics 0.778

ruDetox
SIM 0.403 | FL 0.733 | STA 0.193

ruEthics (one row per evaluation criterion)
Correct: Virtue -0.352 | Law -0.409 | Moral -0.387 | Justice -0.349 | Utilitarianism -0.312
Good: Virtue -0.459 | Law -0.45 | Moral -0.49 | Justice -0.397 | Utilitarianism -0.362
Ethical: Virtue -0.472 | Law -0.484 | Moral -0.496 | Justice -0.439 | Utilitarianism -0.39

ruHateSpeech (Accuracy), by target group
Women 0.787 | Men 0.771 | LGBT 0.588 | Nationalities 0.811 | Migrants 0.571 | Other 0.852
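The ruDetox subcategory scores (SIM, FL, STA) and the overall ruDetox result are consistent with the joint score commonly used for detoxification, where each sample's style-transfer accuracy (STA), content similarity (SIM), and fluency (FL) are multiplied and the products are averaged over the dataset. Assuming that scoring scheme (the exact MERA formula is not stated here), a sketch:

```python
def detox_j(sta, sim, fl):
    # Joint score J: per-sample product of the three components,
    # averaged over the dataset. Note J != mean(STA) * mean(SIM) * mean(FL),
    # since averaging and multiplying do not commute.
    return sum(s * m * f for s, m, f in zip(sta, sim, fl)) / len(sta)
```

This would explain why the overall ruDetox value (0.068) is not simply the product of the three averages (0.193 x 0.403 x 0.733, about 0.057).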