Starling-LM-7B-alpha

BODBE LLM, created on 22.02.2024 at 13:17
Overall result: 0.433
Note: the submission does not contain all the required tasks.

Ratings for leaderboard tasks


Task name | Result | Metric
LCS | 0.082 | Accuracy
RCB | 0.532 / 0.485 | Accuracy / F1 macro
USE | 0.066 | Grade norm
RWSD | 0.573 | Accuracy
PARus | 0.724 | Accuracy
ruTiE | 0.549 | Accuracy
MultiQ | 0.18 / 0.002 | F1 / Exact match
CheGeKa | 0.029 / 0 | F1 / Exact match
ruModAr | 0.473 | Exact match
ruMultiAr | 0.227 | Exact match
MathLogicQA | 0.374 | Accuracy
ruWorldTree | 0.829 / 0.829 | Accuracy / F1 macro
ruOpenBookQA | 0.775 / 0.774 | Accuracy / F1 macro
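Several tasks above report two numbers because two metrics are computed; for example, RCB's 0.532 / 0.485 are Accuracy and macro-averaged F1. As a minimal sketch of how the two metrics differ, in plain Python with toy three-way entailment labels (all data below is hypothetical, not taken from the submission):

```python
def accuracy(y_true, y_pred):
    # Fraction of predictions that exactly match the gold label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    # Unweighted mean of per-class F1 scores; unlike accuracy,
    # it penalizes models that ignore rare classes.
    labels = set(y_true) | set(y_pred)
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Toy RCB-style labels: entailment (e), contradiction (c), neutral (n).
y_true = ["e", "e", "c", "n", "n", "c"]
y_pred = ["e", "c", "c", "n", "e", "c"]
print(accuracy(y_true, y_pred))  # 4/6, about 0.667
print(macro_f1(y_true, y_pred))
```

The two scores diverge whenever per-class performance is uneven, which is why the table lists both.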

Evaluation on open tasks:



Task name | Result | Metric
BPS | 0.374 | Accuracy
ruMMLU | 0.673 | Accuracy
SimpleAr | 0.941 | Exact match
ruHumanEval | 0.015 / 0.076 / 0.152 | Pass@k
ruHHH | 0.742 | Accuracy
ruHateSpeech | 0.691 | Accuracy
ruDetox | 0.138 | Joint score

ruEthics (results by criterion and ground-truth variant):
Criterion | Correct | Good | Ethical
Virtue | -0.328 | -0.316 | -0.398
Law | -0.353 | -0.329 | -0.402
Moral | -0.336 | -0.336 | -0.387
Justice | -0.294 | -0.274 | -0.366
Utilitarianism | -0.248 | -0.234 | -0.316
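ruHumanEval reports three Pass@k values, presumably at k = 1, 5, 10 in line with the usual HumanEval protocol: n completions are sampled per problem and scored against unit tests. A minimal sketch of the standard unbiased pass@k estimator from the HumanEval paper, with hypothetical per-problem counts (not the submission's actual data):

```python
from math import comb

def pass_at_k(n, c, k):
    # Unbiased estimate of the probability that at least one of k
    # samples, drawn from n total of which c are correct, passes.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical: 10 samples per problem, number correct per problem.
n = 10
correct = [0, 1, 0, 3]
for k in (1, 5, 10):
    score = sum(pass_at_k(n, c, k) for c in correct) / len(correct)
    print(f"pass@{k} = {score:.3f}")
```

The estimator explains why the three numbers rise with k: more samples give more chances for at least one correct solution.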

Information about the submission:

MERA version: -
Torch version: -
Codebase version: -
CUDA version: -
Model weights precision: -
Seed: -
Batch size: -
Transformers version: -
Number and type of GPUs: -
Architecture: -

Team:

BODBE LLM

Name of the ML model:

Starling-LM-7B-alpha

Additional links:

https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha

Architecture description:

Starling-7B was trained with reinforcement learning from AI feedback (RLAIF). The model leverages a new GPT-4-labeled ranking dataset, berkeley-nest/Nectar, together with a new reward-model training and policy-tuning pipeline.

Description of the training:

Fine-tuned from Openchat 3.5 (based on Mistral-7B-v0.1).

Pretrain data:

The Nectar ranking dataset.

Training Details:

Reinforcement learning from AI feedback (RLAIF): https://arxiv.org/abs/2306.02231

License:

The dataset and the model are intended for non-commercial use only, subject to the LLaMA license on data distillation (https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md), the terms of use for data generated with OpenAI services (https://openai.com/policies/terms-of-use), and the privacy practices of ShareGPT (https://chrome.google.com/webstore/detail/sharegpt-share-your-chatg/daiacboceoaocpibfodeljbdfacokfjb).

Strategy, generation and parameters:

MERA v.1.1.0, LM-Harness 0.3.0. Frameworks: torch 2.1.0 + CUDA 12.1. Max length: 6169 tokens on the ruTiE task; no limit on the others.


Ratings by subcategory

ruHHH (Metric: Accuracy)
Model, team | Honest | Helpful | Harmless
Starling-LM-7B-alpha, BODBE LLM | 0.705 | 0.695 | 0.828
ruMMLU by subject (Metric: Accuracy)
Model, team: Starling-LM-7B-alpha, BODBE LLM
Anatomy: 0.7
Virology: 0.625
Astronomy: 0.7
Marketing: 0.657
Nutrition: 0.714
Sociology: 0.7
Management: 0.667
Philosophy: 0.588
Prehistory: 0.5
Human aging: 0.8
Econometrics: 0.727
Formal logic: 0.6
Global facts: 0.7
Jurisprudence: 0.577
Miscellaneous: 0.5
Moral disputes: 0.5
Business ethics: 0.7
Biology (college): 0.63
Physics (college): 0.6
Human Sexuality: 1
Moral scenarios: 0.2
World religions: 0.75
Abstract algebra: 0.6
Medicine (college): 0.647
Machine learning: 0.7
Medical genetics: 0.909
Professional law: 0.75
PR: 0.786
Security studies: 1
Chemistry (school): 0.727
Computer security: 0.3
International law: 0.611
Logical fallacies: 0.6
Politics: 0.8
Clinical knowledge: 0.727
Conceptual physics: 0.9
Math (college): 0.7
Biology (high school): 0.667
Physics (high school): 0.5
Chemistry (high school): 0.6
Geography (high school): 0.785
Professional medicine: 0.8
Electrical engineering: 0.7
Elementary mathematics: 0.4
Psychology (high school): 0.75
Statistics (high school): 0.8
History (high school): 0.9
Math (high school): 0.3
Professional accounting: 0.4
Professional psychology: 0.8
Computer science (college): 0.591
World history (high school): 0.813
Macroeconomics: 0.853
Microeconomics: 0.8
Computer science (high school): 0.625
European history: 0.394
Government and politics: 0.667
ruDetox
Model, team | SIM | FL | STA
Starling-LM-7B-alpha, BODBE LLM | 0.479 | 0.615 | 0.357
ruEthics, results by ground-truth variant (Correct / Good / Ethical):

Correct
Model, team | Virtue | Law | Moral | Justice | Utilitarianism
Starling-LM-7B-alpha, BODBE LLM | -0.328 | -0.353 | -0.336 | -0.294 | -0.248

Good
Model, team | Virtue | Law | Moral | Justice | Utilitarianism
Starling-LM-7B-alpha, BODBE LLM | -0.316 | -0.329 | -0.336 | -0.274 | -0.234

Ethical
Model, team | Virtue | Law | Moral | Justice | Utilitarianism
Starling-LM-7B-alpha, BODBE LLM | -0.398 | -0.402 | -0.387 | -0.366 | -0.316
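The ruEthics numbers are correlations between the model's answers and each ethical criterion under the three ground-truth variants; negative values mean the answers anti-correlate with the criterion. A minimal sketch, assuming a binary-label correlation such as the Matthews coefficient (the exact metric used by the benchmark is an assumption here, as is all data below):

```python
from math import sqrt

def mcc(y_true, y_pred):
    # Matthews correlation coefficient for binary labels (0/1);
    # ranges from -1 (total disagreement) to 1 (perfect agreement).
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical binary judgments: criterion labels vs model answers.
labels  = [1, 1, 0, 0, 1, 0]
answers = [0, 1, 1, 0, 0, 1]
print(mcc(labels, answers))  # negative, i.e. anti-correlated
```

A score near zero would mean the model's answers carry no signal about the criterion; the consistently negative table values indicate systematic disagreement.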
ruHateSpeech
Model, team | Women | Men | LGBT | Nationalities | Migrants | Other
Starling-LM-7B-alpha, BODBE LLM | 0.694 | 0.771 | 0.647 | 0.649 | 0.429 | 0.705