Task | Result | Metric
---|---|---
LCS | 0.082 | Accuracy
RCB | 0.521 / 0.48 | Avg. F1 / Accuracy
USE | 0.069 | Grade Norm
RWSD | 0.635 | Accuracy
PARus | 0.858 | Accuracy
ruTiE | 0.695 | Accuracy
MultiQ | 0.151 / 0.071 | F1-score / EM
CheGeKa | 0.071 / 0 | F1 / EM
ruModAr | 0.674 | EM
ruMultiAr | 0.288 | EM
MathLogicQA | 0.408 | Accuracy
ruWorldTree | 0.907 / 0.907 | Avg. F1 / Accuracy
ruOpenBookQA | 0.825 / 0.825 | Avg. F1 / Accuracy
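Several tasks above report paired F1 / EM scores. As a reference, a minimal sketch of the standard SQuAD-style metrics: exact match after normalization, and token-level F1 over the bag of words shared between prediction and gold answer (function names are illustrative, not the benchmark's own code):

```python
from collections import Counter

def exact_match(pred: str, gold: str) -> int:
    """EM: 1 if the normalized strings are identical, else 0."""
    return int(pred.strip().lower() == gold.strip().lower())

def token_f1(pred: str, gold: str) -> float:
    """Token-level F1: harmonic mean of precision and recall
    over overlapping tokens."""
    p, g = pred.lower().split(), gold.lower().split()
    common = sum((Counter(p) & Counter(g)).values())
    if common == 0:
        return 0.0
    precision = common / len(p)
    recall = common / len(g)
    return 2 * precision * recall / (precision + recall)
```

Scores are averaged over all examples in a task, which is how a pair like 0.151 / 0.071 (F1 / EM) arises: partial token overlap earns F1 credit even when the exact answer is missed.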
Task | Result | Metric
---|---|---
BPS | 0.157 | Accuracy
ruMMLU | 0.776 | Accuracy
SimpleAr | 0.977 | EM
ruHumanEval | 0.024 / 0.122 / 0.244 | pass@k
ruHHH | 0.747 | Accuracy
ruHateSpeech | 0.785 | Accuracy
ruDetox | | Overall average score (J) / Meaning preservation (SIM) / Fluency (FL) / Style transfer accuracy (STA)
ruEthics | [[-0.352, -0.409, -0.387, -0.349, -0.312], … | 5 MCC
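The ruHumanEval row reports pass@k at three values of k. As a reference, the standard unbiased pass@k estimator (Chen et al., 2021) can be sketched as follows; the function name is illustrative, and this is not the benchmark's own harness code:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: given n generated samples per task,
    of which c pass the tests, estimate the probability that at least
    one of k randomly drawn samples is correct."""
    if n - c < k:
        # Fewer than k incorrect samples exist, so any k-subset
        # must contain a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

The per-task estimates are then averaged over the dataset, which yields the three reported numbers as k grows.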
Mixtral 8x7B Instruct
Mixtral 8x7B Instruct is the instructed version of Mixtral 8x7B, optimized through supervised fine-tuning and direct preference optimization (DPO) for careful instruction following. Mixtral is pre-trained on data extracted from the open Web; the experts and the routers are trained simultaneously. The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts that outperforms Llama 2 70B on most benchmarks.
License: Apache 2.0.
Code version v.1.1.0. All parameters were left unchanged and used as prepared by the organizers. Details:
- 2 x NVIDIA A100 + accelerate
- dtype bfloat16
- PyTorch 2.0.1 + CUDA 11.7
- Transformers 4.36.2
- Context length 10624
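A minimal sketch of loading the model under the settings listed above (bfloat16, two A100s via accelerate's automatic device map). The model id and generation flags are assumptions for illustration, not the organizers' evaluation harness:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face model id for Mixtral 8x7B Instruct.
model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # dtype from the run details above
    device_map="auto",           # accelerate shards layers across the 2 x A100
)

# Mixtral's instruction format wraps the prompt in [INST] ... [/INST].
prompt = "[INST] Ответь одним словом: столица России? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

With `device_map="auto"`, accelerate places the expert layers across the available GPUs without any manual parallelism code, which matches the "2 x NVIDIA A100 + accelerate" setup.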