Qwen2.5-1.5B-Instruct

MERA Created 04.03.2026 14:13

0.178

Overall score

Leaderboard task scores

Task	Result	Metric
YABLoCo	0.043 / 0.01	EM pass@k
stRuCom	0.16	chrF
RealCode	0.004 / 0.955	pass@k execution_success
UnitTests	0.088	CodeBLEU
ruCodeEval	0.006 / 0.026 / 0.043	pass@k
JavaTestGen	0.044 / 0.273	pass@k compile@1
ruHumanEval	0.007 / 0.024 / 0.037	pass@k
RealCodeJava	0.087 / 0.973	pass@k execution_success
CodeLinterEval	0.403 / 0.566 / 0.6	pass@k
ruCodeReviewer	0.014 / 0.123 / 0 / 0 / 0	chrF BLEU judge@1 judge@5 judge@10
CodeCorrectness	0.837	EM
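
Several rows above pack more than one value into the Result column. The sketch below shows one way to read such a row. It assumes that values pair positionally with the metric names, and that a single metric name with several values (as in ruCodeEval) means the same metric at several settings of k. The page does not state which k values are used, so the code keeps them as a plain list. The function name `parse_row` is hypothetical.

```python
def parse_row(row: str) -> dict:
    """Split one tab-separated leaderboard row into a task name and its scores."""
    task, result, metric = row.split("\t")
    values = [float(v) for v in result.split("/")]
    names = metric.split()
    if len(names) == len(values):
        # Assumption: the i-th value belongs to the i-th metric name.
        scores = dict(zip(names, values))
    elif len(names) == 1:
        # One metric name, several values (e.g. pass@k at unspecified k).
        scores = {names[0]: values}
    else:
        raise ValueError(f"cannot align values and metrics in row: {row!r}")
    return {"task": task, "scores": scores}


yabloco = parse_row("YABLoCo\t0.043 / 0.01\tEM pass@k")
rucodeeval = parse_row("ruCodeEval\t0.006 / 0.026 / 0.043\tpass@k")
print(yabloco)
print(rucodeeval)
```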

Submission information

MERA version

v1.0.0

Torch version

2.9.1

Codebase version

6aae2a5

CUDA version

12.8

Model weight precision

bfloat16

Seed

1234

Batch

Transformers version

4.57.6

Number and type of GPUs

1 x NVIDIA A100-SXM4-80GB

Architecture

vllm

Team:

MERA

ML model name:

Qwen2.5-1.5B-Instruct

ML model link:

https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct

Model size

1.5B

Model type:

Open

SFT

Additional links:

https://qwenlm.github.io/blog/qwen2.5/ https://arxiv.org/pdf/2412.15115

Architecture description:

Qwen 2.5 is the new generation of the Qwen model series. It has significantly more knowledge and greatly improved capabilities in coding and mathematics.

Training description:

The Qwen 2.5 pre-training process consists of several key components. First, the authors carefully curate high-quality training data through sophisticated filtering and scoring mechanisms, combined with a strategic data mixture. Second, they conduct extensive research on hyperparameter optimization to effectively train models at various scales. Finally, they incorporate specialized long-context pre-training to enhance the model's ability to process and understand extended sequences. Pre-training is followed by SFT and multi-stage reinforcement learning.

Pre-training data:

Pre-training: high-quality datasets totaling 18 trillion tokens. SFT: over 1 million samples.

License:

apache-2.0

Inference parameters

Generation parameters:
rucodeeval - do_sample=true;temperature=0.6;max_gen_toks=1024;until=["\nclass","\ndef","\n#","\nif","\nprint"];
codelintereval - do_sample=true;temperature=0.6;max_gen_toks=1024;until=["\n\n"];
rucodereviewer - temperature=0;do_sample=false;max_gen_toks=1000;until=["\n\n"];
ruhumaneval - do_sample=true;temperature=0.6;max_gen_toks=1024;until=["\nclass","\ndef","\n#","\nif","\nprint"];
strucom - do_sample=false;max_gen_toks=512;until=["\n\n"];
unittests - do_sample=false;max_gen_toks=1024;until=["\n\n"];
codecorrectness - until=["\n\n"];do_sample=false;temperature=0;
realcode - do_sample=true;max_gen_toks=4096;temperature=0.7;repetition_penalty=1.05;top_p=0.8;until=["<|endoftext|>","<|im_end|>"];
realcodejava - do_sample=true;max_gen_toks=4096;temperature=0.7;repetition_penalty=1.05;top_p=0.8;until=["<|endoftext|>","<|im_end|>"];
javatestgen - do_sample=true;max_gen_toks=4096;temperature=0.2;top_p=0.9;until=["<|endoftext|>","<|im_end|>"];
yabloco_oracle - max_gen_toks=2048;do_sample=false;until=["<|endoftext|>","<|im_end|>","\n\n\n","\\sclass\\s","\\sdef\\s","^def\\s","^class\\s","^if\\s","@","^#"];
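
The per-task strings above are `key=value` pairs separated by semicolons, with the stop sequences given as a JSON-style array. A minimal sketch of turning one such string into a Python dictionary follows. The helper name `parse_generation_params` is hypothetical, and the approach assumes that no stop sequence contains a semicolon, which holds for every task listed here. How the evaluation harness itself parses these strings is not stated on this page.

```python
import json


def parse_generation_params(spec: str) -> dict:
    """Parse 'key=value;key=value;...' into a dict with JSON-decoded values."""
    params = {}
    # Assumption: ';' never occurs inside a value (true for the strings above).
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after the trailing ';'
        key, _, raw = item.partition("=")
        try:
            # Decodes true/false, numbers, and arrays such as ["\n\n"].
            params[key] = json.loads(raw)
        except json.JSONDecodeError:
            params[key] = raw  # keep anything else as a plain string
    return params


# Raw string so that \n stays a two-character escape until JSON decodes it.
spec = r'do_sample=true;temperature=0.6;max_gen_toks=1024;until=["\nclass","\ndef","\n#","\nif","\nprint"];'
params = parse_generation_params(spec)
print(params)
```

After decoding, each `until` entry holds a real newline character rather than the two characters `\` and `n`, which is the form a stop-sequence option would normally expect.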

Context size:
rucodeeval, codelintereval, rucodereviewer, ruhumaneval, strucom, unittests, codecorrectness, realcode, realcodejava, javatestgen, yabloco_oracle - 32768

Template description:
{%- if tools %}
    {{- '<|im_start|>system\n' }}
    {%- if messages[0]['role'] == 'system' %}
        {{- messages[0]['content'] }}
    {%- else %}
        {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}
    {%- endif %}
    {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
    {%- for tool in tools %}
        {{- "\n" }}
        {{- tool | tojson }}
    {%- endfor %}
    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
    {%- if messages[0]['role'] == 'system' %}
        {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
    {%- else %}
        {{- '<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n' }}
    {%- endif %}
{%- endif %}
{%- for message in messages %}
    {%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}
        {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
    {%- elif message.role == "assistant" %}
        {{- '<|im_start|>' + message.role }}
        {%- if message.content %}
            {{- '\n' + message.content }}
        {%- endif %}
        {%- for tool_call in message.tool_calls %}
            {%- if tool_call.function is defined %}
                {%- set tool_call = tool_call.function %}
            {%- endif %}
            {{- '\n<tool_call>\n{"name": "' }}
            {{- tool_call.name }}
            {{- '", "arguments": ' }}
            {{- tool_call.arguments | tojson }}
            {{- '}\n</tool_call>' }}
        {%- endfor %}
        {{- '<|im_end|>\n' }}
    {%- elif message.role == "tool" %}
        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != "tool") %}
            {{- '<|im_start|>user' }}
        {%- endif %}
        {{- '\n<tool_response>\n' }}
        {{- message.content }}
        {{- '\n</tool_response>' }}
        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
            {{- '<|im_end|>\n' }}
        {%- endif %}
    {%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n' }}
{%- endif %}
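
To make the template's output concrete, here is a plain-Python sketch of only its no-tools branch: a system turn (the default Qwen prompt if none is supplied), then each message wrapped in `<|im_start|>` and `<|im_end|>` markers, then an optional assistant prefix. The function name `render_chat` is hypothetical, tool calls and tool responses are deliberately left out, and this is an illustration of the template logic rather than the code the leaderboard runs.

```python
DEFAULT_SYSTEM = "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."


def render_chat(messages, add_generation_prompt=True):
    """Render messages the way the template's no-tools branch would."""
    # A leading system message replaces the default system prompt.
    if messages and messages[0]["role"] == "system":
        system = messages[0]["content"]
    else:
        system = DEFAULT_SYSTEM
    parts = [f"<|im_start|>system\n{system}<|im_end|>\n"]

    for index, message in enumerate(messages):
        role = message["role"]
        if role == "system" and index == 0:
            continue  # already emitted above
        if role not in ("user", "system", "assistant"):
            # Tool turns are handled by the full template, not by this sketch.
            raise ValueError(f"unsupported role in this sketch: {role}")
        parts.append(f"<|im_start|>{role}\n{message['content']}<|im_end|>\n")

    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chat([{"role": "user", "content": "Hi"}])
print(prompt)
```

The `<|im_end|>` marker that closes every turn is also one of the stop sequences listed for realcode, realcodejava, javatestgen, and yabloco_oracle in the generation parameters.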
