ruGPT-3.5 13B

Created 12.01.2024 11:18

Overall score: 0.208

Task          Result                  Metric
BPS           0.492                   Accuracy
LCS           0.132                   Accuracy
RCB           0.331 / 0.194           Avg. F1 / Accuracy
USE           0.025                   Grade Norm
RWSD          0.523                   Accuracy
PARus         0.504                   Accuracy
ruTiE         0.488                   Accuracy
MultiQ        0.115 / 0.036           F1-score / EM
ruMMLU        0.246                   Accuracy
CheGeKa       0.037 / 0               F1 / EM
ruModAr       0.001                   Accuracy
SimpleAr      0.029                   Accuracy
ruMultiAr     0.025                   Accuracy
MathLogicQA   0.258                   Accuracy
ruHumanEval   0.001 / 0.003 / 0.006   pass@k
ruWorldTree   0.246 / 0.22            Avg. F1 / Accuracy
ruOpenBookQA  0.223 / 0.208           Avg. F1 / Accuracy
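
The overall score reported at the top of the page appears consistent with an unweighted mean over the 17 tasks, where a task with several metrics first contributes the mean of its own metrics. This aggregation rule is an assumption inferred from the numbers on this page, not a statement of the official scoring code. A minimal Python sketch of that arithmetic:

```python
from statistics import mean

# Per-task metric values copied from the results table above.
task_metrics = {
    "BPS": [0.492],
    "LCS": [0.132],
    "RCB": [0.331, 0.194],
    "USE": [0.025],
    "RWSD": [0.523],
    "PARus": [0.504],
    "ruTiE": [0.488],
    "MultiQ": [0.115, 0.036],
    "ruMMLU": [0.246],
    "CheGeKa": [0.037, 0.0],
    "ruModAr": [0.001],
    "SimpleAr": [0.029],
    "ruMultiAr": [0.025],
    "MathLogicQA": [0.258],
    "ruHumanEval": [0.001, 0.003, 0.006],
    "ruWorldTree": [0.246, 0.22],
    "ruOpenBookQA": [0.223, 0.208],
}


def overall_score(metrics):
    """Mean over tasks of each task's mean metric (assumed aggregation rule)."""
    return mean(mean(values) for values in metrics.values())


print(round(overall_score(task_metrics), 3))
```

Under this assumption the rounded result matches the overall score of 0.208 shown on the page.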

Evaluation on diagnostic datasets:

Not counted in the overall ranking

Task          Result   Metric
ruHHH         0.472    Accuracy
  • Honest: 0.475
  • Harmless: 0.466
  • Helpful: 0.475
ruHateSpeech  0.543    Accuracy
  • Women: 0.537
  • Men: 0.657
  • LGBT: 0.647
  • Nationality: 0.514
  • Migrants: 0.286
  • Other: 0.508
ruDetox
  • Overall average score (J): 0.286
  • Meaning preservation score (SIM): 0.562
  • Naturalness score (FL): 0.704
  • Style transfer accuracy (STA): 0.678

ruEthics
                Correct   Good    Ethical
Virtue          -0.036    0.045   0.034
Law             -0.023    0.035   -0.021
Morality        -0.025    0.034   0.029
Justice         -0.017    0.045   0.049
Utilitarianism  -0.016    0.04    0.067

Table results:

[[-0.036, -0.023, -0.025, -0.017, -0.016],
[0.045, 0.035, 0.034, 0.045, 0.04],
[0.034, -0.021, 0.029, 0.049, 0.067]]

Metric: 5 MCC
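
The "5 MCC" label reads as five Matthews correlation coefficients, apparently one per ethical criterion. As a reminder of what the reported values mean, here is a minimal sketch of MCC for binary labels. The function name and the toy inputs are illustrative assumptions and are not taken from the benchmark code:

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data: perfect agreement, perfect disagreement, no correlation.
print(mcc([1, 0, 1, 0], [1, 0, 1, 0]))
print(mcc([1, 0, 1, 0], [0, 1, 0, 1]))
print(mcc([1, 1, 0, 0], [1, 0, 1, 0]))
```

MCC ranges from -1 to 1, so values close to 0, like those in the ruEthics table, indicate predictions that are nearly uncorrelated with the labels.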

Submission information:

Team:

MERA

ML model name:

ruGPT-3.5 13B

Link to the ML model:

https://huggingface.co/ai-forever/ruGPT-3.5-13B

Additional links:

https://habr.com/ru/companies/sberbank/articles/746736/

Architecture description:

ruGPT-3 is a Russian counterpart of GPT-3 (Brown et al., 2020). The model has 13B parameters. It is the largest model so far and was used to train the first version of GigaChat.

Training description:

The model was trained with the DeepSpeed and Megatron libraries on a 300B-token dataset for 3 epochs, which took around 45 days on 512 V100 GPUs. It was then fine-tuned for 1 epoch with a sequence length of 2048, for around 20 days on 200 A100 GPUs, on additional data (see the pretraining data description).

Pretraining data:

The model was pretrained on 300 GB of text from various domains, then additionally trained on 100 GB of code and legal documents. The training data was deduplicated: each text in the corpus is hashed with a 64-bit hash, and only texts with a unique hash are kept. The documents are also filtered by their text compression rate, measured with zlib. The most strongly and most weakly compressing deduplicated texts are discarded.
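
The deduplication and compression-rate filtering described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the source does not name the hash function (truncated SHA-256 is used here), and the `low` and `high` thresholds are hypothetical placeholders.

```python
import hashlib
import zlib


def text_hash64(text: str) -> int:
    """64-bit hash of a text. Assumption: first 8 bytes of SHA-256."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def deduplicate(texts):
    """Keep the first occurrence of each text with a unique 64-bit hash."""
    seen = set()
    unique = []
    for text in texts:
        h = text_hash64(text)
        if h not in seen:
            seen.add(h)
            unique.append(text)
    return unique


def compression_ratio(text: str) -> float:
    """Raw size divided by zlib-compressed size; higher means more repetitive."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw))


def filter_by_compression(texts, low=1.2, high=8.0):
    """Drop texts that compress too weakly or too strongly.

    The thresholds are hypothetical; the source does not state them.
    """
    return [t for t in texts if low <= compression_ratio(t) <= high]


corpus = ["same text", "same text", "a" * 1000]
deduped = deduplicate(corpus)
print(len(deduped))
print(compression_ratio("a" * 1000) > 8.0)
```

A highly repetitive document such as `"a" * 1000` compresses very strongly and would be discarded, while very short or noisy texts compress weakly and fall below the lower threshold.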

Training details:

After the final training stage, the model's perplexity was around 8.8 for Russian.
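
For context, perplexity is conventionally the exponential of the average negative log-likelihood per token. The source does not state the evaluation set or tokenization behind the 8.8 figure, so the sketch below only illustrates the standard definition, with made-up token probabilities:

```python
import math


def perplexity(token_log_probs):
    """exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Illustrative only: a model that assigns probability 1/8.8 to every
# token has perplexity 8.8.
uniform = [math.log(1 / 8.8)] * 100
print(round(perplexity(uniform), 1))
```

Read this way, a perplexity of about 8.8 means the model is, on average, as uncertain as if it were choosing uniformly among roughly nine tokens at each step.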

License:

MIT

Strategy, generation, and parameters:

Code version v.1.1.0. All parameters were left unchanged and used as prepared by the organizers. Details:
  • 1 x NVIDIA A100
  • dtype auto
  • PyTorch 2.1.2 + CUDA 12.1
  • Transformers 4.36.2
  • Context length 2048