Yi 34B 200K

Created at 03.02.2024 14:05

Overall score on the main tasks: 0.455

The submission does not contain all the required tasks


Task name       Result          Metric
LCS             0.108           Accuracy
RCB             0.498 / 0.402   Avg. F1 / Accuracy
USE             0.049           Grade Norm
RWSD            0.562           Accuracy
PARus           0.74            Accuracy
ruTiE           0.602           Accuracy
MultiQ          0.185 / 0.107   F1-score / EM
CheGeKa         0.01 / 0        F1 / EM
ruModAr         0.635           EM
ruMultiAr       0.277           EM
MathLogicQA     0.473           Accuracy
ruWorldTree     0.838 / 0.838   Avg. F1 / Accuracy
ruOpenBookQA    0.748 / 0.746   Avg. F1 / Accuracy

Evaluation on open tasks:

These results are not taken into account in the overall rating


Task name      Result                  Metric
BPS            0.426                   Accuracy
ruMMLU         0.676                   Accuracy
SimpleAr       0.981                   EM
ruHumanEval    0.004 / 0.021 / 0.043   pass@k
ruHHH          0.601                   Accuracy
  • Honest: 0.607
  • Harmless: 0.586
  • Helpful: 0.61
ruHateSpeech   0.626                   Accuracy
  • Women: 0.657
  • Men: 0.629
  • LGBT: 0.706
  • Nationality: 0.703
  • Migrants: 0.429
  • Other: 0.525
ruDetox        0.161                   J (overall average score)
  • Assessment of the preservation of meaning (SIM): 0.433
  • Assessment of naturalness (FL): 0.636
  • Style Transfer Accuracy (STA): 0.379
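
As a rough illustration of how the ruDetox numbers relate, below is a minimal Python sketch assuming J is computed as in the RUSSE Detox shared task, i.e. the per-sample product of STA, SIM and FL averaged over the test set; the exact aggregation used by the benchmark is not shown on this page.

```python
# Minimal sketch of the ruDetox J score, assuming (not stated on this page) that
# J is the per-sample product of STA, SIM and FL averaged over the test set,
# as in the RUSSE Detox shared task. This would also explain why the reported
# J (0.161) is not simply the product of the three averages.
def j_score(sta, sim, fl):
    """sta, sim, fl: per-sample scores in [0, 1], all the same length."""
    assert len(sta) == len(sim) == len(fl)
    return sum(s * m * f for s, m, f in zip(sta, sim, fl)) / len(sta)

# Toy example with two samples:
print(j_score([0.9, 0.2], [0.8, 0.7], [0.6, 0.5]))  # 0.251
```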

ruEthics

Criterion        Correct   Good     Ethical
Virtue           -0.12     -0.199   -0.161
Law              -0.113    -0.144   -0.145
Moral            -0.108    -0.164   -0.132
Justice          -0.125    -0.153   -0.159
Utilitarianism   -0.082    -0.154   -0.113


Metric: 5 MCC (Matthews correlation coefficient, one per criterion)
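
For context, each ruEthics value is a Matthews correlation coefficient between the model's binary answers and the annotators' labels. Below is a minimal sketch of how one such coefficient could be computed; the toy labels and the use of scikit-learn are illustrative assumptions, not the MERA evaluation code.

```python
# Illustrative only: MCC between binary model answers and binary annotator labels
# for a single criterion/question pair (e.g. Virtue vs. "Correct").
# The toy labels below are made up; real labels come from the ruEthics data.
from sklearn.metrics import matthews_corrcoef

annotator_labels = [1, 0, 1, 1, 0, 0, 1, 0]
model_answers    = [1, 1, 0, 1, 0, 1, 1, 0]

print(matthews_corrcoef(annotator_labels, model_answers))  # value in [-1, 1]
```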

Information about the submission:

Team:

LM Research

Name of the ML model:

Yi 34B 200K

Additional links:

https://github.com/01-ai/Yi

Architecture description:

Yi 34B follows the same model architecture as LLaMA, with a 200K-token context window.
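
Since the architecture is LLaMA-compatible, the long-context setting can be inspected directly from the published configuration. The snippet below is a sketch: the Hugging Face repo id "01-ai/Yi-34B-200K" is an assumption based on the linked GitHub project, and older releases may require trust_remote_code=True.

```python
# Sketch: inspect the LLaMA-style config of the 200K-context checkpoint.
# The repo id "01-ai/Yi-34B-200K" is an assumption (not stated on this page).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("01-ai/Yi-34B-200K")
print(config.model_type)               # LLaMA-style decoder-only architecture
print(config.max_position_embeddings)  # the 200K context window
```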

Description of the training:

Yi has independently built its own efficient training pipelines and robust training infrastructure from the ground up.

Pretrain data:

Trained on a 3T-token multilingual corpus.

Training Details:

Yi has independently created its own high-quality training datasets, efficient training pipelines, and robust training infrastructure entirely from the ground up.

License:

Apache 2.0 license

Strategy, generation and parameters:

Code version v1.1.0. All parameters were left unchanged and used as prepared by the organizers.

Details:
  • 2 x NVIDIA A100
  • dtype: float16
  • PyTorch 2.1.2 + CUDA 12.1
  • Transformers 4.36.2
  • Context length: 11000
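
Below is a minimal sketch of loading the model with the settings listed above (float16, Transformers 4.x, 11000-token context). The repo id is an assumption, and the actual MERA evaluation harness and generation parameters are not reproduced here.

```python
# Illustrative sketch only, not the organizers' evaluation code.
# Assumes the checkpoint is available as "01-ai/Yi-34B-200K" on the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "01-ai/Yi-34B-200K"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # dtype float16, as listed above
    device_map="auto",          # shard across the 2 x NVIDIA A100 GPUs
)

# Truncate prompts to the stated 11000-token context before generating.
inputs = tokenizer("Пример запроса", return_tensors="pt",
                   truncation=True, max_length=11000).to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```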