Task | Result | Metric
---|---|---
LCS | 0.09 | Accuracy |
RCB | 0.329 / 0.258 | Avg. F1 / Accuracy |
USE | 0.01 | Grade Norm |
RWSD | 0.5 | Accuracy |
PARus | 0.478 | Accuracy |
ruTiE | 0.493 | Accuracy |
MultiQ | 0.098 / 0.014 | F1 / EM
CheGeKa | 0.043 / 0 | F1 / EM |
ruModAr | 0.486 | EM |
ruMultiAr | 0.156 | EM |
MathLogicQA | 0.314 | Accuracy |
ruWorldTree | 0.703 / 0.703 | Avg. F1 / Accuracy |
ruOpenBookQA | 0.638 / 0.637 | Avg. F1 / Accuracy |
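Several rows above report two scores: a token-level F1 and exact match (EM). As a minimal illustrative sketch (not MERA's official scorer, whose normalization rules may differ), the two metrics are conventionally computed like this:

```python
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized prediction equals the reference, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over shared tokens."""
    pred_tokens = prediction.strip().lower().split()
    ref_tokens = reference.strip().lower().split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

# Example: a partially overlapping answer earns F1 credit but no EM credit.
print(exact_match("in 1961", "1961"))  # 0.0
print(token_f1("in 1961", "1961"))     # 0.667
```

This is why the F1 column is consistently at or above the EM column: EM only rewards verbatim matches.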
Task | Result | Metric
---|---|---
BPS | 0.507 | Accuracy
ruMMLU | 0.563 | Accuracy
SimpleAr | 0.911 | EM
ruHumanEval | 0.008 / 0.04 / 0.079 | pass@k
ruHHH | 0.466 | Accuracy
ruHateSpeech | 0.581 | Accuracy
ruDetox | | Overall average score (J) / Meaning preservation score (SIM) / Naturalness score (FL) / Style transfer accuracy (STA)
ruEthics | [[-0.102, -0.076, -0.132, -0.122, -0.142], …] | 5 MCC
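The ruHumanEval row reports three pass@k values (presumably pass@1 / pass@5 / pass@10). As a hedged sketch, the standard unbiased pass@k estimator from the HumanEval paper (Chen et al., 2021) can be computed as follows; the sample counts in the example are illustrative, not MERA's actual settings:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).
    n = total samples generated per task, c = samples passing the tests,
    k = evaluation budget. Equals 1 - C(n-c, k) / C(n, k), computed as a
    numerically stable running product."""
    if n - c < k:
        return 1.0
    return 1.0 - math.prod(1.0 - k / i for i in range(n - c + 1, n + 1))

# Example: 10 samples per task, 1 of which passes the unit tests.
for k in (1, 5, 10):
    print(k, round(pass_at_k(n=10, c=1, k=k), 3))  # 0.1, 0.5, 1.0
```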
Llama 2 13B
Llama 2 is an auto-regressive language model that uses an optimized transformer architecture; this variant has 13B parameters.
The authors used custom training libraries, Meta's Research Super Cluster, and production clusters for pretraining; fine-tuning, annotation, and evaluation were also performed on third-party cloud compute. Pretraining took 368,640 GPU hours.
Llama 2 was pretrained on 2 trillion tokens of data from publicly available sources.
Token counts refer to pretraining data only. All models were trained with a global batch size of 4M tokens.
A custom commercial license is available at: https://ai.meta.com/resources/models-and-libraries/llama-downloads/
Code version v.1.1.0. All parameters were left unchanged and used as prepared by the organizers. Details:
- 1 x NVIDIA A100
- dtype: auto
- PyTorch 2.1.2 + CUDA 12.1
- Transformers 4.36.2
- Context length: 4096
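For reference, below is a minimal sketch of loading the model under settings that match the configuration above (single GPU, automatic dtype selection, 4096-token context built into the checkpoint). The checkpoint ID is the public Hugging Face name for this model, which is an assumption about how the evaluation was run; access to it requires accepting Meta's license:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed public checkpoint; gated behind Meta's custom commercial license.
MODEL_ID = "meta-llama/Llama-2-13b-hf"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",  # matches "dtype: auto" in the config above
).to("cuda")             # single NVIDIA A100, as in the evaluation setup

# Russian prompt, since MERA is a Russian-language benchmark.
prompt = "Вопрос: Сколько будет 2 + 2?\nОтвет:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```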