Task name | Result | Metric |
---|---|---|
LCS | 0.08 | Accuracy |
RCB | 0.466 / 0.424 | Avg. F1 / Accuracy |
USE | 0.031 | Grade Norm |
RWSD | 0.5 | Accuracy |
PARus | 0.744 | Accuracy |
ruTiE | 0.453 | Accuracy |
MultiQ | 0.185 / 0.041 | F1-score / EM
CheGeKa | 0.076 / 0 | F1 / EM |
ruModAr | 0.65 | EM |
ruMultiAr | 0.216 | EM |
MathLogicQA | 0.388 | Accuracy |
ruWorldTree | 0.914 / 0.915 | Avg. F1 / Accuracy |
ruOpenBookQA | 0.818 / 0.817 | Avg. F1 / Accuracy |
Task name | Result | Metric |
---|---|---|
BPS | 0.495 | Accuracy
ruMMLU | 0.741 | Accuracy
SimpleAr | 0.965 | EM
ruHumanEval | 0.02 / 0.101 / 0.201 | pass@k
ruHHH | 0.573 | Accuracy
ruHateSpeech | 0.585 | Accuracy
ruDetox |  | Overall average score (J) / Assessment of the preservation of meaning (SIM) / Assessment of naturalness (FL) / Style Transfer Accuracy (STA)
ruEthics | Table results: [[-0.113, -0.124, -0.151, -0.065, -0.076], | 5 MCC
NLP Team
Llama 2 70b
Llama 2 is an auto-regressive language model that uses an optimized transformer architecture. Number of parameters: 70B.
The authors used custom training libraries, Meta's Research Super Cluster, and production clusters for pretraining; fine-tuning, annotation, and evaluation were also performed on third-party cloud compute. Pretraining took 1,720,320 GPU hours.
Llama 2 was pretrained on 2 trillion tokens of data from publicly available sources. The model uses a standard transformer architecture with pre-normalization via RMSNorm, the SwiGLU activation function, and rotary positional embeddings. The primary architectural differences from Llama 1 are the increased context length and grouped-query attention (GQA).
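To make the architectural terms above concrete, here is a minimal PyTorch sketch of RMSNorm pre-normalization and a SwiGLU feed-forward block. The module names, toy dimensions, and epsilon value are illustrative assumptions, not the Llama 2 implementation.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square layer norm: rescale by 1/RMS(x), no mean subtraction."""
    def __init__(self, dim: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class SwiGLUFeedForward(nn.Module):
    """Gated feed-forward: SiLU(x W_gate) * (x W_up), projected back to dim."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(nn.functional.silu(self.w_gate(x)) * self.w_up(x))

# Pre-normalization: normalize the sub-layer input, then add the residual.
x = torch.randn(2, 16, 512)                       # (batch, seq, dim) — toy sizes
ffn = SwiGLUFeedForward(dim=512, hidden_dim=1376)
norm = RMSNorm(512)
y = x + ffn(norm(x))                              # residual around the pre-normed FFN
```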
Token counts refer to pretraining data only. All models were trained with a global batch size of 4M tokens.
A custom commercial license is available at: https://ai.meta.com/resources/models-and-libraries/llama-downloads/
Code version v.1.1.0. All parameters were left unchanged and used as prepared by the organizers. Details:
- 4 x NVIDIA A100 + accelerate
- dtype float16
- PyTorch 2.0.1 + CUDA 11.7
- Transformers 4.36.2
- Context length 4096
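For illustration only (this is not the organizers' evaluation harness), a float16 Transformers + Accelerate loading sketch matching the details above could look like the following; the checkpoint name, prompt, and generation settings are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-2-70b-hf"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,   # dtype float16, as in the submission details
    device_map="auto",           # accelerate shards the 70B weights across the GPUs
)

prompt = "Question: What is 2 + 2? Answer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=16, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```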