Cotype (preview)

MTS AI Created at 10.12.2024 22:24
Overall result: 0.633
Place in the rating: 43
In the top by tasks:
9 MultiQ (the task is one of the main ones)
5 ruMMLU (the result on the task is higher than human)
Weak tasks:
54 RWSD
52 PARus
90 RCB
72 ruEthics
56 ruWorldTree
34 ruOpenBookQA
143 CheGeKa
57 ruHateSpeech
38 ruDetox
70 ruHHH
41 ruTiE
116 ruHumanEval
50 USE
56 MathLogicQA
64 ruMultiAr
55 SimpleAr
30 LCS
24 BPS
36 ruModAr
62 MaMuRAMu
91 ruCodeEval

Ratings for leaderboard tasks

Task name       Result                   Metric
LCS             0.322                    Accuracy
RCB             0.568 / 0.555            Accuracy / F1 macro
USE             0.362                    Grade norm
RWSD            0.646                    Accuracy
PARus           0.928                    Accuracy
ruTiE           0.87                     Accuracy
MultiQ          0.644 / 0.499            F1 / Exact match
CheGeKa         0.202 / 0.156            F1 / Exact match
ruModAr         0.898                    Exact match
MaMuRAMu        0.827                    Accuracy
ruMultiAr       0.414                    Exact match
ruCodeEval      0.219 / 0.312 / 0.341    Pass@k
MathLogicQA     0.705                    Accuracy
ruWorldTree     0.981 / 0.981            Accuracy / F1 macro
ruOpenBookQA    0.94 / 0.94              Accuracy / F1 macro
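
The page does not state how the overall result is aggregated. As a minimal sketch, assume it is the unweighted mean over the leaderboard tasks above, with the metrics of a multi-metric task averaged first. That aggregation rule is an assumption, but it does reproduce the 0.633 shown at the top of the page. The function name `overall_result` is illustrative only.

```python
from statistics import mean

# Scores copied from the leaderboard table above.
# Tasks that report several metrics list all of them.
leaderboard = {
    "LCS": [0.322],
    "RCB": [0.568, 0.555],
    "USE": [0.362],
    "RWSD": [0.646],
    "PARus": [0.928],
    "ruTiE": [0.87],
    "MultiQ": [0.644, 0.499],
    "CheGeKa": [0.202, 0.156],
    "ruModAr": [0.898],
    "MaMuRAMu": [0.827],
    "ruMultiAr": [0.414],
    "ruCodeEval": [0.219, 0.312, 0.341],
    "MathLogicQA": [0.705],
    "ruWorldTree": [0.981, 0.981],
    "ruOpenBookQA": [0.94, 0.94],
}


def overall_result(scores):
    """Assumed aggregation: mean of per-task means (not confirmed by the page)."""
    return mean(mean(values) for values in scores.values())


overall = overall_result(leaderboard)
print(round(overall, 3))  # 0.633
```

Only the 15 leaderboard tasks enter this mean; under this assumption the open tasks listed further down (BPS, ruMMLU, SimpleAr, and so on) do not contribute.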

Evaluation on open tasks:

Task name       Result                   Metric
BPS             0.997                    Accuracy
ruMMLU          0.903                    Accuracy
SimpleAr        0.995                    Exact match
ruHumanEval     0.177 / 0.261 / 0.293    Pass@k
ruHHH           0.848
ruHateSpeech    0.838
ruDetox         0.339

ruEthics
                  Correct    Good     Ethical
Virtue            0.401      0.398    0.465
Law               0.397      0.396    0.448
Moral             0.421      0.428    0.497
Justice           0.357      0.357    0.419
Utilitarianism    0.35       0.345    0.416
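
The Pass@k rows for ruHumanEval and ruCodeEval report three values each, but the page does not say which values of k they correspond to or how they were computed. For orientation, here is a sketch of the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n samples are generated per problem and c of them pass the tests. Whether this submission used exactly this estimator is an assumption, and the numbers in the usage lines are made up for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of pass@k for a single problem.

    n: number of generated samples, c: samples that pass, k: budget.
    Equals 1 - C(n - c, k) / C(n, k): the chance that a random
    size-k subset of the n samples contains at least one passing one.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every size-k subset has a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative numbers only (not taken from this submission).
example_k1 = pass_at_k(n=10, c=5, k=1)
example_k5 = pass_at_k(n=10, c=5, k=5)
print(example_k1, example_k5)
```

The benchmark-level score is then the mean of this per-problem estimate over all problems, which is why the value grows with k across the three reported numbers.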

Information about the submission

MERA version: v.1.2.0
Torch version: 2.5.1
Codebase version: 9cf05b2
CUDA version: 12.4
Precision of the model weights: bfloat16
Seed: 1234
Batch: 1
Transformers version: 4.46.3
Number and type of GPUs: 4 x NVIDIA A100-SXM4-40GB
Architecture: vllm

Team:

MTS AI

Name of the ML model:

Cotype (preview)

Model type:

Closed

Architecture description:

Cotype (preview) is an experimental language model from the MTS AI team, aimed at the needs of the corporate segment. The model is not yet available to a wide audience, but you can find more information about MTS AI products and discuss collaboration opportunities at: https://mts.ai/ru/product/generative-ai-solutions/.

Description of the training:

-

Pretrain data:

-

License:

MTS AI Cotype

Inference parameters

Generation Parameters:
simplear - do_sample=false;until=["\n"];
chegeka - do_sample=false;until=["\n"];
rudetox - do_sample=false;until=["\n"];
rumultiar - do_sample=false;until=["\n"];
multiq - do_sample=false;until=["\n"];
rumodar - do_sample=false;until=["\n"];
ruhumaneval - do_sample=true;until=["\nclass","\ndef","\n#","\nif","\nprint"];temperature=0.6;
rucodeeval - do_sample=true;until=["\nclass","\ndef","\n#","\nif","\nprint"];temperature=0.6;
use - do_sample=false;until=["\n","."];
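
To make the parameter strings above easier to read, here is a small sketch that parses one `key=value;` spec into a dictionary and applies the `until` list as stop sequences, cutting a completion at the earliest match. The helper names `parse_generation_params` and `truncate_at_stop` are hypothetical, and treating `until` as "truncate at the first stop sequence" is an assumption about the evaluation harness, not something this page states.

```python
import json


def parse_generation_params(spec: str) -> dict:
    """Parse 'key=value;key=value;' into a dict (hypothetical helper).

    Each value is decoded as JSON, so 'false' -> False, '0.6' -> 0.6 and
    '["\\n"]' -> ['\n']. Assumes no stop sequence contains ';'.
    """
    params = {}
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, raw = item.partition("=")
        params[key.strip()] = json.loads(raw)
    return params


def truncate_at_stop(text: str, stops: list) -> str:
    """Cut text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# The ruhumaneval / rucodeeval spec from the list above (raw string so the
# backslash escapes reach the JSON decoder unchanged).
spec = r'do_sample=true;until=["\nclass","\ndef","\n#","\nif","\nprint"];temperature=0.6;'
params = parse_generation_params(spec)

# Made-up completion: the function body is kept, the trailing call is cut.
completion = "\n    return a + b\n\nprint(add(1, 2))"
truncated = truncate_at_stop(completion, params["until"])
print(params["do_sample"], params["temperature"], repr(truncated))
```

Under this reading, the code tasks stop as soon as the model starts a new top-level construct, while the short-answer tasks stop at the first newline.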

The size of the context:
32768

System prompt:
(Translated from Russian.) Solve the task strictly according to the instruction. Answer only, no explanations. Numeric answer: only the number. Letter, digit, or word: only those. Choosing an answer option: one letter or digit. The answer must be exact, with no extra characters or words. If you need to generate Python code, your answer must be only the code (a continuation of the code from the instruction); do not repeat the function name, do not give explanations, do not write comments, do not use input, write the code so that it completes the function from the instruction (with the appropriate indentation), and always start the code with a line break!

Ratings by subcategory

Metric: Grade Norm
Model, team 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 8_0 8_1 8_2 8_3 8_4
Cotype (preview)
MTS AI
0.567 0.533 0.867 0.167 0.267 0.567 0.167 - 0.067 0.067 0.1 0.067 0.3 0.1 0.2 0.333 0.033 0.033 0 0.067 0.1 0.7 0.533 0.333 0.267 0.767 0.3 0.3 0.633 0.6 0.633
Model, team Honest Helpful Harmless
Cotype (preview)
MTS AI
0.852 0.797 0.897
Model, team Anatomy Virology Astronomy Marketing Nutrition Sociology Management Philosophy Prehistory Human aging Econometrics Formal logic Global facts Jurisprudence Miscellaneous Moral disputes Business ethics Biology (college) Physics (college) Human sexuality Moral scenarios World religions Abstract algebra Medicine (college) Machine learning Medical genetics Professional law PR Security studies Chemistry (college) Computer security International law Logical fallacies Politics Clinical knowledge Conceptual physics Math (college) Biology (high school) Physics (high school) Chemistry (high school) Geography (high school) Professional medicine Electrical engineering Elementary mathematics Psychology (high school) Statistics (high school) History (high school) Math (high school) Professional accounting Professional psychology Computer science (college) World history (high school) Macroeconomics Microeconomics Computer science (high school) European history Government and politics
Cotype (preview)
MTS AI
0.904 0.753 0.987 0.966 0.954 0.945 0.893 0.923 0.96 0.915 0.877 0.841 0.87 0.935 0.934 0.899 0.87 0.965 0.867 0.939 0.848 0.936 0.9 0.936 0.92 0.97 0.738 0.907 0.935 0.78 0.92 0.95 0.902 0.98 0.943 0.953 0.92 0.981 0.907 0.897 0.955 0.963 0.855 0.939 0.961 0.917 0.99 0.893 0.858 0.921 0.9 0.962 0.938 0.966 0.99 0.952 0.943
Model, team SIM FL STA
Cotype (preview)
MTS AI
0.671 0.696 0.759
Model, team Anatomy Virology Astronomy Marketing Nutrition Sociology Management Philosophy Pre-History Gerontology Econometrics Formal logic Global facts Jurisprudence Miscellaneous Moral disputes Business ethics Biology (college) Physics (college) Human sexuality Moral scenarios World religions Abstract algebra Medicine (college) Machine Learning Genetics Professional law PR Security Chemistry (college) Computer security International law Logical fallacies Politics Clinical knowledge Conceptual physics Math (college) Biology (high school) Physics (high school) Chemistry (high school) Geography (high school) Professional medicine Electrical Engineering Elementary mathematics Psychology (high school) Statistics (high school) History (high school) Math (high school) Professional Accounting Professional psychology Computer science (college) World history (high school) Macroeconomics Microeconomics Computer science (high school) European history Government and politics
Cotype (preview)
MTS AI
0.622 0.921 0.783 0.694 0.908 0.828 0.759 0.737 0.827 0.785 0.808 0.817 0.575 0.791 0.789 0.753 0.804 0.822 0.754 0.877 0.895 0.915 0.911 0.87 0.867 0.864 0.808 0.667 0.947 0.822 0.844 0.936 0.893 0.93 0.742 0.821 0.889 0.867 0.737 0.785 0.873 0.889 0.822 1 0.914 0.911 0.897 0.955 0.892 0.93 0.911 0.855 0.861 0.805 0.628 0.766 0.878
Correct
Model, team Virtue Law Moral Justice Utilitarianism
Cotype (preview)
MTS AI
0.401 0.397 0.421 0.357 0.35
Good
Model, team Virtue Law Moral Justice Utilitarianism
Cotype (preview)
MTS AI
0.398 0.396 0.428 0.357 0.345
Ethical
Model, team Virtue Law Moral Justice Utilitarianism
Cotype (preview)
MTS AI
0.465 0.448 0.497 0.419 0.416
Model, team Women Men LGBT Nationalities Migrants Other
Cotype (preview)
MTS AI
0.87 0.686 0.882 0.757 0.857 0.902