Task name | Result | Metric |
---|---|---|
LCS | 0.078 | Accuracy |
RCB | 0.523 / 0.503 | Avg. F1 / Accuracy |
USE | 0.04 | Grade Norm |
RWSD | 0.654 | Accuracy |
PARus | 0.828 | Accuracy |
ruTiE | 0.7 | Accuracy |
MultiQ | 0.205 / 0.097 | F1-score/EM |
CheGeKa | 0.206 / 0.139 | F1 / EM |
ruModAr | 0.459 | EM |
ruMultiAr | 0.2 | EM |
MathLogicQA | 0.396 | Accuracy |
ruWorldTree | 0.884 / 0.884 | Avg. F1 / Accuracy |
ruOpenBookQA | 0.825 / 0.824 | Avg. F1 / Accuracy |
Task name | Result | Metric |
---|---|---|
BPS | 0.359 | Accuracy |
ruMMLU | 0.698 | Accuracy |
SimpleAr | 0.946 | EM |
ruHumanEval | 0.013 / 0.067 / 0.134 | pass@k |
ruHHH | 0.702 | Accuracy |
ruHateSpeech | 0.747 | Accuracy |
ruDetox | | Overall average score (J) / Meaning preservation (SIM) / Naturalness (FL) / Style Transfer Accuracy (STA) |
ruEthics | [[-0.349, -0.374, -0.374, -0.343, -0.297], | 5 MCC |
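The ruHumanEval row reports three pass@k values (presumably for increasing k, e.g. k = 1, 5, 10 — the exact k values are an assumption). A standard way to compute pass@k is the unbiased estimator popularized by the HumanEval benchmark; a minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of k
    samples drawn without replacement from n generations is correct,
    given that c of the n generations passed the tests."""
    if n - c < k:
        return 1.0  # every size-k draw necessarily contains a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 generations per problem, 2 of them correct
print(round(pass_at_k(10, 2, 1), 3))  # 0.2
```

In practice this is averaged over all problems in the benchmark, with n generations sampled per problem.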
Russian_NLP
SOLAR 10.7B Instruct
SOLAR 10.7B Instruct is the instruction-tuned version of SOLAR-10.7B, an advanced large language model (LLM) with 10.7 billion parameters that demonstrates superior performance on various natural language processing (NLP) tasks.
It was trained with state-of-the-art instruction fine-tuning methods, including supervised fine-tuning (SFT) and direct preference optimization (DPO).
The base model, SOLAR 10.7B, is an LLM with 10.7 billion parameters that likewise performs strongly on a wide range of NLP tasks.
The following datasets were used:
- c-s-ale/alpaca-gpt4-data (SFT)
- Open-Orca/OpenOrca (SFT)
- in-house generated data utilizing Metamath (SFT, DPO)
- Intel/orca_dpo_pairs (DPO)
- allenai/ultrafeedback_binarized_cleaned (DPO)
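DPO trains the policy directly on preference pairs, without a separate reward model. As an illustration only (not the actual SOLAR training code), the per-pair DPO loss can be sketched in plain Python; the log-probability inputs here are hypothetical:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log sigmoid(beta * (margin_policy - margin_ref)),
    where each margin is log p(chosen) - log p(rejected)."""
    logits = beta * ((policy_chosen_logp - policy_rejected_logp)
                     - (ref_chosen_logp - ref_rejected_logp))
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# When the policy's preference margin equals the reference's, the loss is log(2)
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 3))  # 0.693
```

The loss shrinks as the policy widens the gap between chosen and rejected responses relative to the reference model; beta controls how strongly deviation from the reference is penalized.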
License: cc-by-nc-4.0
Code version v1.1.0. All parameters were left unchanged and used as prepared by the organizers.
Details:
- 1 x NVIDIA A100
- dtype auto
- PyTorch 2.1.2 + CUDA 12.1
- Transformers 4.36.2
- Context length 4096
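The setup above can be reproduced approximately with the Hugging Face Transformers API; a sketch, assuming the `upstage/SOLAR-10.7B-Instruct-v1.0` checkpoint (the exact model id is an assumption) and the `accelerate` package for device placement:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "upstage/SOLAR-10.7B-Instruct-v1.0"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",   # matches "dtype auto" in the setup above
    device_map="auto",    # place the model on the single A100
)
```

Generation would then be bounded by the 4096-token context length listed above.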