Mistral 7B

Created at 12.01.2024 11:18

Overall score: 0.4

Task name       Result                   Metric
BPS             0.392                    Accuracy
LCS             0.098                    Accuracy
RCB             0.372 / 0.344            Avg. F1 / Accuracy
USE             0.022                    Grade Norm
RWSD            0.512                    Accuracy
PARus           0.518                    Accuracy
ruTiE           0.502                    Accuracy
MultiQ          0.124 / 0.067            F1-score / EM
ruMMLU          0.676                    Accuracy
CheGeKa         0.038 / 0                F1 / EM
ruModAr         0.516                    Accuracy
SimpleAr        0.95                     Accuracy
ruMultiAr       0.195                    Accuracy
MathLogicQA     0.344                    Accuracy
ruHumanEval     0.012 / 0.058 / 0.116    pass@k
ruWorldTree     0.81 / 0.811             Avg. F1 / Accuracy
ruOpenBookQA    0.735 / 0.732            Avg. F1 / Accuracy
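
The page does not say how the overall score is derived from the task results. As a minimal sketch, assuming it is the unweighted mean over tasks, with the metrics of a multi-metric task averaged first, the table values give roughly 0.400, which agrees with the reported 0.4. The aggregation rule and the names `TASK_SCORES` and `overall_score` are assumptions made for illustration.

```python
from statistics import mean

# Scores copied from the table above; tasks with several metrics list all of them.
TASK_SCORES = {
    "BPS": [0.392],
    "LCS": [0.098],
    "RCB": [0.372, 0.344],
    "USE": [0.022],
    "RWSD": [0.512],
    "PARus": [0.518],
    "ruTiE": [0.502],
    "MultiQ": [0.124, 0.067],
    "ruMMLU": [0.676],
    "CheGeKa": [0.038, 0.0],
    "ruModAr": [0.516],
    "SimpleAr": [0.95],
    "ruMultiAr": [0.195],
    "MathLogicQA": [0.344],
    "ruHumanEval": [0.012, 0.058, 0.116],
    "ruWorldTree": [0.81, 0.811],
    "ruOpenBookQA": [0.735, 0.732],
}


def overall_score(task_scores):
    """Assumed aggregation: average metrics within a task, then average the tasks."""
    per_task = [mean(metrics) for metrics in task_scores.values()]
    return mean(per_task)


if __name__ == "__main__":
    print(f"{overall_score(TASK_SCORES):.3f}")
```

That a simple mean agrees with the published figure is suggestive but not proof; other weightings could round to the same value.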

Evaluation on diagnostic datasets:

These results are not included in the overall score.

Task name       Result    Metric
ruHHH           0.556     Accuracy
  • Honest: 0.541
  • Harmless: 0.586
  • Helpful: 0.542
ruHateSpeech    0.619     Accuracy
  • Women: 0.593
  • Men: 0.686
  • LGBT: 0.588
  • Nationality: 0.595
  • Migrants: 0.429
  • Other: 0.672
ruDetox
  • Overall average score (J): 0.375
  • Assessment of the preservation of meaning (SIM): 0.779
  • Assessment of naturalness (FL): 0.594
  • Style Transfer Accuracy (STA): 0.775

ruEthics
                  Correct   Good      Ethical
Virtue            -0.12     -0.065    -0.114
Law               -0.091    -0.061    -0.115
Moral             -0.114    -0.056    -0.122
Justice           -0.141    -0.047    -0.104
Utilitarianism    -0.129    -0.081    -0.089

Table results (rows: Correct, Good, Ethical; columns: Virtue, Law, Moral, Justice, Utilitarianism):

[[-0.12, -0.091, -0.114, -0.141, -0.129],
[-0.065, -0.061, -0.056, -0.047, -0.081],
[-0.114, -0.115, -0.122, -0.104, -0.089]]

Metric: 5 MCC
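
The raw matrix is the ruEthics table transposed: each matrix row is one question column, and each matrix column is one ethical criterion. The sketch below attaches those labels so a single value can be looked up by name. The helper `label_results` is hypothetical, and the second question label is read here as "Good", which is an assumption about the column heading.

```python
# Raw matrix as reported: one row per question, one column per ethical criterion.
RESULTS = [
    [-0.12, -0.091, -0.114, -0.141, -0.129],
    [-0.065, -0.061, -0.056, -0.047, -0.081],
    [-0.114, -0.115, -0.122, -0.104, -0.089],
]
QUESTIONS = ["Correct", "Good", "Ethical"]  # "Good" is an assumed label
CRITERIA = ["Virtue", "Law", "Moral", "Justice", "Utilitarianism"]


def label_results(matrix, questions, criteria):
    """Return {criterion: {question: score}} from the question-major matrix."""
    if len(matrix) != len(questions):
        raise ValueError("expected one matrix row per question")
    if any(len(row) != len(criteria) for row in matrix):
        raise ValueError("expected one matrix column per criterion")
    return {
        criterion: {q: matrix[qi][ci] for qi, q in enumerate(questions)}
        for ci, criterion in enumerate(criteria)
    }


table = label_results(RESULTS, QUESTIONS, CRITERIA)
print(table["Justice"]["Correct"])
```

All fifteen values are slightly negative and small in magnitude, so under this reading the model's answers show no meaningful positive correlation with any of the five criteria.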

Information about the submission:

Team:

MERA

Name of the ML model:

Mistral 7B

Additional links:

https://arxiv.org/abs/2310.06825

Architecture description:

The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. According to its authors, Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks they tested.

Description of the training:

Mistral 7B uses grouped-query attention (GQA) and sliding window attention (SWA). GQA significantly accelerates inference and reduces the memory requirement during decoding, allowing larger batch sizes and hence higher throughput, a crucial factor for real-time applications. SWA is designed to handle longer sequences more effectively at a reduced computational cost, alleviating a common limitation of LLMs. Together, these attention mechanisms contribute to the performance and efficiency of Mistral 7B.
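
The two mechanisms can be illustrated with a small pure-Python sketch. It is not the model's implementation: the window convention (each position sees itself and the `window - 1` positions before it), the function names, and the head counts in the usage lines are all illustrative assumptions rather than values taken from this page.

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j.

    Attention is causal (j <= i) and limited to the `window` most recent positions.
    """
    return [
        [i - window < j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query_head(query_head, n_query_heads, n_kv_heads):
    """Map a query head to the key/value head it shares under GQA."""
    if n_query_heads % n_kv_heads != 0:
        raise ValueError("n_query_heads must be a multiple of n_kv_heads")
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size


def kv_cache_entries_per_token(n_layers, n_kv_heads, head_dim):
    """Values cached per token: one key and one value vector per KV head per layer."""
    return 2 * n_layers * n_kv_heads * head_dim


# Usage with illustrative sizes.
mask = sliding_window_mask(seq_len=6, window=3)
print([j for j, allowed in enumerate(mask[5]) if allowed])  # [3, 4, 5]

full = kv_cache_entries_per_token(n_layers=32, n_kv_heads=32, head_dim=128)
grouped = kv_cache_entries_per_token(n_layers=32, n_kv_heads=8, head_dim=128)
print(full // grouped)  # 4
```

With four query heads sharing each key/value head, the cache holds a quarter as many values per token. That is the memory saving during decoding described above. The mask shows why SWA cost grows with the window size, not the full sequence length.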

Pretrain data:

-

Training Details:

Mistral-7B-v0.1 is a transformer model with the following architecture choices:
  • Grouped-Query Attention
  • Sliding-Window Attention
  • Byte-fallback BPE tokenizer
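
Byte fallback means that text the vocabulary cannot cover is encoded as raw UTF-8 byte tokens instead of a single unknown token, so no input is lost. The toy encoder below works on single characters instead of learned BPE merges, and the `<0xNN>` token spelling and both function names are assumptions for illustration only.

```python
def encode_with_byte_fallback(text, vocab):
    """Toy character-level encoder: out-of-vocabulary characters become byte tokens."""
    tokens = []
    for ch in text:
        if ch in vocab:
            tokens.append(ch)
        else:
            # Fall back to one token per UTF-8 byte, e.g. "<0xE2>".
            tokens.extend(f"<0x{b:02X}>" for b in ch.encode("utf-8"))
    return tokens


def decode_with_byte_fallback(tokens):
    """Inverse of the toy encoder: reassemble byte tokens into UTF-8 text."""
    out = bytearray()
    for tok in tokens:
        if len(tok) == 6 and tok.startswith("<0x") and tok.endswith(">"):
            out.append(int(tok[3:5], 16))
        else:
            out.extend(tok.encode("utf-8"))
    return out.decode("utf-8")


vocab = set("ab ")
tokens = encode_with_byte_fallback("ab €", vocab)
print(tokens)  # ['a', 'b', ' ', '<0xE2>', '<0x82>', '<0xAC>']
print(decode_with_byte_fallback(tokens))  # ab €
```

The round trip is lossless even for characters absent from the vocabulary, at the cost of several tokens per such character.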

License:

Apache 2.0 license

Strategy, generation and parameters:

Code version v.1.1.0. No parameters were changed; all are used as prepared by the organizers.

Details:
  • 1 x NVIDIA A100
  • dtype auto
  • PyTorch 2.1.2 + CUDA 12.1
  • Transformers 4.36.2
  • Context length 11500
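
Anyone trying to reproduce these numbers may want to check that the local environment matches the library versions listed above. The following is a minimal sketch using only the standard library. The helper `compare_environment` is hypothetical, and it assumes the packages are installed under their usual PyPI names, `torch` and `transformers`.

```python
from importlib import metadata

# Library versions reported in the submission details.
REPORTED_VERSIONS = {"torch": "2.1.2", "transformers": "4.36.2"}


def compare_environment(reported):
    """Return {package: (reported, installed or None, matches)} for each package."""
    report = {}
    for package, wanted in reported.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # Builds may carry a local suffix such as "+cu121"; compare the public part only.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[package] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_environment(REPORTED_VERSIONS).items():
        print(f"{name}: reported {wanted}, installed {installed}, match={ok}")
```

A version match does not guarantee identical scores; the GPU model, dtype, and context length listed above also affect the results.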