Qwen3-14B

MERA Created at 15.03.2026 21:06

0.409

The overall result

Ratings for leaderboard tasks


Task name	Result	Metric
YABLoCo	0.048 / 0.005	EM pass@k
stRuCom	0.291	chrF
RealCode	0.283 / 0.959	pass@k execution_success
UnitTests	0.3	CodeBLEU
ruCodeEval	0.674 / 0.717 / 0.732	pass@k
JavaTestGen	0.22 / 0.463	pass@k compile@1
ruHumanEval	0.775 / 0.8 / 0.805	pass@k
RealCodeJava	0.295 / 0.977	pass@k execution_success
CodeLinterEval	0.389 / 0.538 / 0.591	pass@k
ruCodeReviewer	0.041 / 0.2 / 0.07 / 0.167 / 0.216	chrF BLEU judge@1 judge@5 judge@10
CodeCorrectness	0.812	EM
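
Several rows above report pass@k. The page does not say how the metric is computed or which values of k the three columns correspond to. A commonly used definition is the unbiased estimator introduced with HumanEval; the sketch below assumes that definition and is not taken from the MERA codebase. The function name `pass_at_k` and its arguments are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single task.

    n: number of generations sampled for the task
    c: number of those generations that passed the tests
    k: evaluation budget (assumed here, e.g. 1, 5 or 10)
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k generations failed, any draw of k contains a pass.
    if n - c < k:
        return 1.0
    # 1 - P(all k drawn generations are failures), drawing without replacement.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

The leaderboard score would then be the mean of this value over all tasks in the benchmark, under the same assumption.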

Information about the submission

Mera version

v1.0.0

Torch Version

2.9.1

The version of the codebase

7c56310

CUDA version

12.8

Precision of the model weights

bfloat16

Seed

1234

Batch

Transformers version

4.56.1

The number of GPUs and their type

4 x NVIDIA A100-SXM4-80GB

Architecture

local-chat-completions

Team:

MERA

Name of the ML model:

Qwen3-14B

Link to the ML model:

https://huggingface.co/Qwen/Qwen3-14B

Model size

14.0B

Model type:

Open

SFT

Additional links:

https://huggingface.co/Qwen/Qwen3-14B

Architecture description:

Qwen3-14B is a decoder-only transformer language model from the Qwen3 family with approximately 14 billion parameters. The model is designed for general language understanding and generation tasks, including reasoning, coding, and multilingual applications, and supports long-context inputs.

Description of the training:

The model follows the standard Qwen3 multi-stage training pipeline consisting of large-scale pretraining followed by post-training. Post-training includes instruction tuning and alignment techniques to improve instruction following, reasoning ability, and response quality.

Pretrain data:

The model was pretrained on a multilingual corpus of approximately 36 trillion tokens covering 119 languages. Post-training used instruction-following and reasoning-oriented datasets.

License:

Apache License 2.0

Inference parameters

Generation Parameters:
codecorrectness - until=["<|im_end|>"];do_sample=false;temperature=0;max_gen_toks=10000;
codelintereval - do_sample=true;temperature=0.6;max_gen_toks=10000;until=["<|im_end|>"];
javatestgen - do_sample=true;max_gen_toks=10000;temperature=0.2;top_p=0.9;until=["<|im_end|>"];
realcode - do_sample=true;max_gen_toks=10000;temperature=0.7;repetition_penalty=1.05;top_p=0.8;until=["<|im_end|>"];
realcodejava - do_sample=true;max_gen_toks=10000;temperature=0.7;repetition_penalty=1.05;top_p=0.8;until=["<|im_end|>"];
rucodeeval_code - do_sample=true;temperature=0.6;max_gen_toks=10000;until=["<|im_end|>"];
rucodereviewer - temperature=0;do_sample=false;max_gen_toks=10000;until=["<|im_end|>"];
ruhumaneval_code - do_sample=true;temperature=0.6;max_gen_toks=10000;until=["<|im_end|>"];
strucom - do_sample=false;max_gen_toks=10000;until=["<|im_end|>"];
unittests - do_sample=false;max_gen_toks=10000;until=["<|im_end|>"];
yabloco_oracle - max_gen_toks=10000;do_sample=false;until=["<|im_end|>"];
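
The per-task settings above follow a simple `task - key=value;key=value;` pattern. To compare them programmatically, one could parse them into dictionaries. The sketch below is a hypothetical helper (`parse_generation_params` is not part of MERA); it assumes values never contain a semicolon and that each value is either valid JSON or a plain string.

```python
import json


def parse_generation_params(text: str) -> dict:
    """Parse 'task - k=v;k=v;' entries into {task: {key: value}}."""
    tasks = {}
    # Entries may be separated by real newlines or by a literal "\n" sequence.
    for entry in text.replace("\\n", "\n").splitlines():
        entry = entry.strip()
        if not entry:
            continue
        task, _, params = entry.partition(" - ")
        settings = {}
        for pair in params.split(";"):  # assumes no ';' inside values
            pair = pair.strip()
            if not pair:
                continue
            key, _, raw = pair.partition("=")
            try:
                # Handles numbers, true/false and JSON lists such as stop strings.
                settings[key] = json.loads(raw)
            except json.JSONDecodeError:
                settings[key] = raw
        tasks[task.strip()] = settings
    return tasks


example = 'strucom - do_sample=false;max_gen_toks=10000;until=["<|im_end|>"];'
parsed = parse_generation_params(example)
```

With the parsed form it is straightforward to check, for instance, which tasks use greedy decoding (`do_sample` false) and which use sampling.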

Description of the template:
{%- if tools %}
    {{- '<|im_start|>system\n' }}
    {%- if messages[0].role == 'system' %}
        {{- messages[0].content + '\n\n' }}
    {%- endif %}
    {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
    {%- for tool in tools %}
        {{- "\n" }}
        {{- tool | tojson }}
    {%- endfor %}
    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
    {%- if messages[0].role == 'system' %}
        {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
    {%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
    {%- set index = (messages|length - 1) - loop.index0 %}
    {%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
        {%- set ns.multi_step_tool = false %}
        {%- set ns.last_query_index = index %}
    {%- endif %}
{%- endfor %}
{%- for message in messages %}
    {%- if message.content is string %}
        {%- set content = message.content %}
    {%- else %}
        {%- set content = '' %}
    {%- endif %}
    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
        {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
    {%- elif message.role == "assistant" %}
        {%- set reasoning_content = '' %}
        {%- if message.reasoning_content is string %}
            {%- set reasoning_content = message.reasoning_content %}
        {%- else %}
            {%- if '</think>' in content %}
                {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
                {%- set content = content.split('</think>')[-1].lstrip('\n') %}
            {%- endif %}
        {%- endif %}
        {%- if loop.index0 > ns.last_query_index %}
            {%- if loop.last or (not loop.last and reasoning_content) %}
                {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
            {%- else %}
                {{- '<|im_start|>' + message.role + '\n' + content }}
            {%- endif %}
        {%- else %}
            {{- '<|im_start|>' + message.role + '\n' + content }}
        {%- endif %}
        {%- if message.tool_calls %}
            {%- for tool_call in message.tool_calls %}
                {%- if (loop.first and content) or (not loop.first) %}
                    {{- '\n' }}
                {%- endif %}
                {%- if tool_call.function %}
                    {%- set tool_call = tool_call.function %}
                {%- endif %}
                {{- '<tool_call>\n{"name": "' }}
                {{- tool_call.name }}
                {{- '", "arguments": ' }}
                {%- if tool_call.arguments is string %}
                    {{- tool_call.arguments }}
                {%- else %}
                    {{- tool_call.arguments | tojson }}
                {%- endif %}
                {{- '}\n</tool_call>' }}
            {%- endfor %}
        {%- endif %}
        {{- '<|im_end|>\n' }}
    {%- elif message.role == "tool" %}
        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
            {{- '<|im_start|>user' }}
        {%- endif %}
        {{- '\n<tool_response>\n' }}
        {{- content }}
        {{- '\n</tool_response>' }}
        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
            {{- '<|im_end|>\n' }}
        {%- endif %}
    {%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n' }}
    {%- if enable_thinking is defined and enable_thinking is false %}
        {{- '<think>\n\n</think>\n\n' }}
    {%- endif %}
{%- endif %}
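
For the simplest case (no tools, only system and user messages), the template above reduces to a few string concatenations. The sketch below mirrors just that path in plain Python to show what the model receives; `render_prompt` is a hypothetical helper, not the tokenizer's API, and it assumes the line breaks in the template are plain "\n" characters. Assistant turns, tool calls and tool responses are deliberately left out.

```python
def render_prompt(messages, add_generation_prompt=True, enable_thinking=True):
    """Approximate the template for system/user-only conversations."""
    parts = []
    for message in messages:
        role = message["role"]
        if role not in ("system", "user"):
            # Assistant and tool turns need the full template logic.
            raise ValueError(f"unsupported role in this sketch: {role}")
        parts.append(f"<|im_start|>{role}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
        if not enable_thinking:
            # An empty think block tells the model to skip its reasoning phase.
            parts.append("<think>\n\n</think>\n\n")
    return "".join(parts)


prompt = render_prompt(
    [{"role": "user", "content": "Hi"}],
    enable_thinking=False,
)
```

In the template, the empty think block is emitted only when `enable_thinking` is explicitly set to false; leaving it undefined behaves like the thinking-enabled case, which the default argument here imitates.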
