Gemma-4-E4B-it

MERA Created 20.04.2026 11:02

Leaderboard task scores


Board    Result  Attempted Score  Coverage  Rank
Multi    0.467   0.467            1         3
Images   0.474   0.474            1         7
Audio    0.377   0.377            1         6
Video    0.559   0.559            1         8
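In every row above, Result equals Attempted Score and Coverage is 1 (all tasks attempted). A minimal sketch of that consistency check, assuming the common convention that the reported result is the attempted score scaled by coverage; MERA's actual aggregation is not stated on this page:

```python
# Hypothetical check: Result == Attempted Score * Coverage for each board.
# The scoring convention is an assumption, not confirmed by the leaderboard.
boards = {
    "Multi":  {"result": 0.467, "attempted_score": 0.467, "coverage": 1},
    "Images": {"result": 0.474, "attempted_score": 0.474, "coverage": 1},
    "Audio":  {"result": 0.377, "attempted_score": 0.377, "coverage": 1},
    "Video":  {"result": 0.559, "attempted_score": 0.559, "coverage": 1},
}

def consistent(board: dict) -> bool:
    """True when the reported result equals attempted score scaled by coverage."""
    return abs(board["result"] - board["attempted_score"] * board["coverage"]) < 1e-9

print(all(consistent(b) for b in boards.values()))  # True for this submission
```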

Tasks


Task  Modality  Result  Metrics
(the Task and Modality columns did not survive extraction; results, metrics, and per-domain breakdowns are preserved below)

0.662   EM JudgeScore
0.304   EM F1
0.425   EM JudgeScore
0.347   EM JudgeScore
0.361   EM JudgeScore
0.419   EM JudgeScore
0.67    EM JudgeScore
0.098   EM JudgeScore
0.597   EM JudgeScore
0.596   EM JudgeScore
0.486   EM JudgeScore
0.525   EM JudgeScore
0.361   EM JudgeScore
0.546   EM JudgeScore
0.557   EM JudgeScore
0.126   EM JudgeScore
    culture: 0.036 / 0.157
    business: 0.048 / 0.294
    medicine: 0.033 / 0.224
    social_sciences: 0.061 / 0.3
    fundamental_sciences: 0.027 / 0.171
    applied_sciences: 0.06 / 0.231
0.59    EM JudgeScore
    biology: 0.616 / 0.666
    chemistry: 0.658 / 0.686
    physics: 0.681 / 0.744
    economics: 0.564 / 0.605
    ru: 0.402 / 0.422
    all: 0.449 / 0.479
0.736   EM JudgeScore
    biology: 0.439 / 0.456
    chemistry: 0.627 / 0.642
    physics: 0.798 / 0.879
    science: 0.805 / 0.805
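Each domain row reports an EM / JudgeScore pair. How the leaderboard rolls these up into the task-level number is not stated here (it may be sample-weighted); a sketch of a simple unweighted macro average over the last breakdown above, for illustration only:

```python
# Unweighted macro average over the per-domain EM / JudgeScore pairs of one
# task (the biology/chemistry/physics/science breakdown above).
# NOTE: the leaderboard's actual aggregation is not published on this page.
domains = {
    "biology":   (0.439, 0.456),
    "chemistry": (0.627, 0.642),
    "physics":   (0.798, 0.879),
    "science":   (0.805, 0.805),
}

em_avg = sum(em for em, _ in domains.values()) / len(domains)
judge_avg = sum(js for _, js in domains.values()) / len(domains)
print(em_avg, judge_avg)
```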

Submission information

MERA version
v1.0.0
Torch version
2.8.0
Codebase version
0f32158
CUDA version
12.8
Model weight precision
bfloat16
Seed
1234
Batch size
1
transformers version
4.57.1
GPU count and type
1 x NVIDIA A100-SXM4-80GB
Architecture
local-chat-completions
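The `local-chat-completions` architecture means the harness reaches the model through an OpenAI-style chat-completions endpoint served locally. A minimal sketch of the payload such a setup would send; the endpoint URL and field names are assumptions based on the common `/v1/chat/completions` convention, and the decoding settings mirror the generation parameters reported below:

```python
import json

# Hypothetical local endpoint; a serving stack would typically expose
# /v1/chat/completions. URL and model name are placeholders, not harness config.
ENDPOINT = "http://localhost:8000/v1/chat/completions"

def build_request(prompt: str) -> dict:
    """Build a chat-completions payload mirroring this submission's settings:
    greedy decoding, a long generation budget, and an <eos> stop sequence."""
    return {
        "model": "google/gemma-4-E4B-it",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,       # do_sample=false; temperature=0
        "max_tokens": 10000,    # max_gen_toks=10000
        "stop": ["<eos>"],      # until=["<eos>"]
    }

payload = build_request("2+2=?")
print(json.dumps(payload))
```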

Team:

MERA

ML model name:

Gemma-4-E4B-it

Link to the ML model:

https://huggingface.co/google/gemma-4-E4B-it

Model size:

8.0B

Model type:

Open

SFT

Architecture description:

Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on small models) and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Gemma 4 features a context window of up to 256K tokens and maintains multilingual support in over 140 languages. Featuring both Dense and Mixture-of-Experts (MoE) architectures, Gemma 4 is well-suited for tasks like text generation, coding, and reasoning. The models are available in four distinct sizes: E2B, E4B, 26B A4B, and 31B. Their diverse sizes make them deployable in environments ranging from high-end phones to laptops and servers, democratizing access to state-of-the-art AI.

License:

Apache License 2.0

Inference parameters

Generation parameters:
labtabvqa - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
realvqa - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
ruclevr - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
rucommonvqa - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
ruhhh_image - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
rumathvqa - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
runaturalsciencevqa_biology - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
runaturalsciencevqa_chemistry - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
runaturalsciencevqa_earth_science - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
runaturalsciencevqa_physics - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
schoolsciencevqa_biology - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
schoolsciencevqa_chemistry - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
schoolsciencevqa_earth_science - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
schoolsciencevqa_economics - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
schoolsciencevqa_history_all - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
schoolsciencevqa_history_ru - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
schoolsciencevqa_physics - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
unisciencevqa_applied_sciences - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
unisciencevqa_business - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
unisciencevqa_cultural_studies - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
unisciencevqa_fundamental_sciences - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
unisciencevqa_health_and_medicine - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
unisciencevqa_social_sciences - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
weird - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
commonvideoqa - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
realvideoqa - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
ruhhh_video - until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;
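Every task uses the same semicolon-separated `key=value` parameter string. A sketch of parsing one of those strings into a Python dict, JSON-decoding each value where possible (the parser is illustrative, not part of the harness):

```python
import json

def parse_gen_params(spec: str) -> dict:
    """Parse a 'key=value;key=value;' generation-parameter string like the
    ones above, JSON-decoding each value when possible."""
    params = {}
    for pair in filter(None, (p.strip() for p in spec.split(";"))):
        key, _, raw = pair.partition("=")
        try:
            params[key] = json.loads(raw)   # handles lists, booleans, numbers
        except json.JSONDecodeError:
            params[key] = raw               # keep unparseable values as strings
    return params

spec = 'until=["<eos>"];do_sample=false;temperature=0;max_gen_toks=10000;'
print(parse_gen_params(spec))
```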

Context size:
10000

Template description:
{%- macro format_parameters(properties, required) -%} {%- set standard_keys = ['description', 'type', 'properties', 'required', 'nullable'] -%} {%- set ns = namespace(found_first=false) -%} {%- for key, value in properties | dictsort -%} {%- set add_comma = false -%} {%- if key not in standard_keys -%} {%- if ns.found_first %},{% endif -%} {%- set ns.found_first = true -%} {{ key }}:{ {%- if value['description'] -%} description:<|"|>{{ value['description'] }}<|"|> {%- set add_comma = true -%} {%- endif -%} {%- if value['type'] | upper == 'STRING' -%} {%- if value['enum'] -%} {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%} enum:{{ format_argument(value['enum']) }} {%- endif -%} {%- elif value['type'] | upper == 'ARRAY' -%} {%- if value['items'] is mapping and value['items'] -%} {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%} items:{ {%- set ns_items = namespace(found_first=false) -%} {%- for item_key, item_value in value['items'] | dictsort -%} {%- if item_value is not none -%} {%- if ns_items.found_first %},{% endif -%} {%- set ns_items.found_first = true -%} {%- if item_key == 'properties' -%} properties:{ {%- if item_value is mapping -%} {{- format_parameters(item_value, value['items']['required'] | default([])) -}} {%- endif -%} } {%- elif item_key == 'required' -%} required:[ {%- for req_item in item_value -%} <|"|>{{- req_item -}}<|"|> {%- if not loop.last %},{% endif -%} {%- endfor -%} ] {%- elif item_key == 'type' -%} {%- if item_value is string -%} type:{{ format_argument(item_value | upper) }} {%- else -%} type:{{ format_argument(item_value | map('upper') | list) }} {%- endif -%} {%- else -%} {{ item_key }}:{{ format_argument(item_value) }} {%- endif -%} {%- endif -%} {%- endfor -%} } {%- endif -%} {%- endif -%} {%- if value['nullable'] %} {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%} nullable:true {%- endif -%} {%- if value['type'] | upper == 'OBJECT' -%} {%- if 
value['properties'] is defined and value['properties'] is mapping -%} {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%} properties:{ {{- format_parameters(value['properties'], value['required'] | default([])) -}} } {%- elif value is mapping -%} {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%} properties:{ {{- format_parameters(value, value['required'] | default([])) -}} } {%- endif -%} {%- if value['required'] -%} {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%} required:[ {%- for item in value['required'] | default([]) -%} <|"|>{{- item -}}<|"|> {%- if not loop.last %},{% endif -%} {%- endfor -%} ] {%- endif -%} {%- endif -%} {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%} type:<|"|>{{ value['type'] | upper }}<|"|>} {%- endif -%} {%- endfor -%} {%- endmacro -%} {%- macro format_function_declaration(tool_data) -%} declaration:{{- tool_data['function']['name'] -}}{description:<|"|>{{- tool_data['function']['description'] -}}<|"|> {%- set params = tool_data['function']['parameters'] -%} {%- if params -%} ,parameters:{ {%- if params['properties'] -%} properties:{ {{- format_parameters(params['properties'], params['required']) -}} }, {%- endif -%} {%- if params['required'] -%} required:[ {%- for item in params['required'] -%} <|"|>{{- item -}}<|"|> {{- ',' if not loop.last -}} {%- endfor -%} ], {%- endif -%} {%- if params['type'] -%} type:<|"|>{{- params['type'] | upper -}}<|"|>} {%- endif -%} {%- endif -%} {%- if 'response' in tool_data['function'] -%} {%- set response_declaration = tool_data['function']['response'] -%} ,response:{ {%- if response_declaration['description'] -%} description:<|"|>{{- response_declaration['description'] -}}<|"|>, {%- endif -%} {%- if response_declaration['type'] | upper == 'OBJECT' -%} type:<|"|>{{- response_declaration['type'] | upper -}}<|"|>} {%- endif -%} {%- endif -%} } {%- endmacro -%} {%- macro format_argument(argument, 
escape_keys=True) -%} {%- if argument is string -%} {{- '<|"|>' + argument + '<|"|>' -}} {%- elif argument is boolean -%} {{- 'true' if argument else 'false' -}} {%- elif argument is mapping -%} {{- '{' -}} {%- set ns = namespace(found_first=false) -%} {%- for key, value in argument | dictsort -%} {%- if ns.found_first %},{% endif -%} {%- set ns.found_first = true -%} {%- if escape_keys -%} {{- '<|"|>' + key + '<|"|>' -}} {%- else -%} {{- key -}} {%- endif -%} :{{- format_argument(value, escape_keys=escape_keys) -}} {%- endfor -%} {{- '}' -}} {%- elif argument is sequence -%} {{- '[' -}} {%- for item in argument -%} {{- format_argument(item, escape_keys=escape_keys) -}} {%- if not loop.last %},{% endif -%} {%- endfor -%} {{- ']' -}} {%- else -%} {{- argument -}} {%- endif -%} {%- endmacro -%} {%- macro strip_thinking(text) -%} {%- set ns = namespace(result='') -%} {%- for part in text.split('<channel|>') -%} {%- if '<|channel>' in part -%} {%- set ns.result = ns.result + part.split('<|channel>')[0] -%} {%- else -%} {%- set ns.result = ns.result + part -%} {%- endif -%} {%- endfor -%} {{- ns.result | trim -}} {%- endmacro -%} {%- macro format_tool_response_block(tool_name, response) -%} {{- '<|tool_response>' -}} {%- if response is mapping -%} {{- 'response:' + tool_name + '{' -}} {%- for key, value in response | dictsort -%} {{- key -}}:{{- format_argument(value, escape_keys=False) -}} {%- if not loop.last %},{% endif -%} {%- endfor -%} {{- '}' -}} {%- else -%} {{- 'response:' + tool_name + '{value:' + format_argument(response, escape_keys=False) + '}' -}} {%- endif -%} {{- '<tool_response|>' -}} {%- endmacro -%} {%- set ns = namespace(prev_message_type=None) -%} {%- set loop_messages = messages -%} {{- bos_token -}} {#- Handle System/Tool Definitions Block -#} {%- if (enable_thinking is defined and enable_thinking) or tools or messages[0]['role'] in ['system', 'developer'] -%} {{- '<|turn>system\n' -}} {#- Inject Thinking token at the very top of the FIRST system 
turn -#} {%- if enable_thinking is defined and enable_thinking -%} {{- '<|think|>\n' -}} {%- set ns.prev_message_type = 'think' -%} {%- endif -%} {%- if messages[0]['role'] in ['system', 'developer'] -%} {{- messages[0]['content'] | trim -}} {%- set loop_messages = messages[1:] -%} {%- endif -%} {%- if tools -%} {%- for tool in tools %} {{- '<|tool>' -}} {{- format_function_declaration(tool) | trim -}} {{- '<tool|>' -}} {%- endfor %} {%- set ns.prev_message_type = 'tool' -%} {%- endif -%} {{- '<turn|>\n' -}} {%- endif %} {#- Pre-scan: find last user message index for reasoning guard -#} {%- set ns_turn = namespace(last_user_idx=-1) -%} {%- for i in range(loop_messages | length) -%} {%- if loop_messages[i]['role'] == 'user' -%} {%- set ns_turn.last_user_idx = i -%} {%- endif -%} {%- endfor -%} {#- Loop through messages -#} {%- for message in loop_messages -%} {%- if message['role'] != 'tool' -%} {%- set ns.prev_message_type = None -%} {%- set role = 'model' if message['role'] == 'assistant' else message['role'] -%} {#- Detect continuation: suppress duplicate <|turn>model when previous non-tool message was also assistant -#} {%- set prev_nt = namespace(role=None, found=false) -%} {%- if loop.index0 > 0 -%} {%- for j in range(loop.index0 - 1, -1, -1) -%} {%- if not prev_nt.found -%} {%- if loop_messages[j]['role'] != 'tool' -%} {%- set prev_nt.role = loop_messages[j]['role'] -%} {%- set prev_nt.found = true -%} {%- endif -%} {%- endif -%} {%- endfor -%} {%- endif -%} {%- set continue_same_model_turn = (role == 'model' and prev_nt.role == 'assistant') -%} {%- if not continue_same_model_turn -%} {{- '<|turn>' + role + '\n' }} {%- endif -%} {#- Render reasoning/reasoning_content as thinking channel -#} {%- set thinking_text = message.get('reasoning') or message.get('reasoning_content') -%} {%- if thinking_text and loop.index0 > ns_turn.last_user_idx and message.get('tool_calls') -%} {{- '<|channel>thought\n' + thinking_text + '\n<channel|>' -}} {%- endif -%} {%- if 
message['tool_calls'] -%} {%- for tool_call in message['tool_calls'] -%} {%- set function = tool_call['function'] -%} {{- '<|tool_call>call:' + function['name'] + '{' -}} {%- if function['arguments'] is mapping -%} {%- set ns_args = namespace(found_first=false) -%} {%- for key, value in function['arguments'] | dictsort -%} {%- if ns_args.found_first %},{% endif -%} {%- set ns_args.found_first = true -%} {{- key -}}:{{- format_argument(value, escape_keys=False) -}} {%- endfor -%} {%- elif function['arguments'] is string -%} {{- function['arguments'] -}} {%- endif -%} {{- '}<tool_call|>' -}} {%- endfor -%} {%- set ns.prev_message_type = 'tool_call' -%} {%- endif -%} {%- set ns_tr_out = namespace(flag=false) -%} {%- if message.get('tool_responses') -%} {#- Legacy: tool_responses embedded on the assistant message (Google/Gemma native) -#} {%- for tool_response in message['tool_responses'] -%} {{- format_tool_response_block(tool_response['name'] | default('unknown'), tool_response['response']) -}} {%- set ns_tr_out.flag = true -%} {%- set ns.prev_message_type = 'tool_response' -%} {%- endfor -%} {%- elif message.get('tool_calls') -%} {#- OpenAI Chat Completions: forward-scan consecutive role:tool messages -#} {%- set ns_tool_scan = namespace(stopped=false) -%} {%- for k in range(loop.index0 + 1, loop_messages | length) -%} {%- if ns_tool_scan.stopped -%} {%- elif loop_messages[k]['role'] != 'tool' -%} {%- set ns_tool_scan.stopped = true -%} {%- else -%} {%- set follow = loop_messages[k] -%} {#- Resolve tool_call_id to function name -#} {%- set ns_tname = namespace(name=follow.get('name') | default('unknown')) -%} {%- for tc in message['tool_calls'] -%} {%- if tc.get('id') == follow.get('tool_call_id') -%} {%- set ns_tname.name = tc['function']['name'] -%} {%- endif -%} {%- endfor -%} {#- Handle content as string or content-parts array -#} {%- set tool_body = follow.get('content') -%} {%- if tool_body is string -%} {{- format_tool_response_block(ns_tname.name, tool_body) 
-}} {%- elif tool_body is sequence and tool_body is not string -%} {%- set ns_txt = namespace(s='') -%} {%- for part in tool_body -%} {%- if part.get('type') == 'text' -%} {%- set ns_txt.s = ns_txt.s + (part.get('text') | default('')) -%} {%- endif -%} {%- endfor -%} {{- format_tool_response_block(ns_tname.name, ns_txt.s) -}} {%- else -%} {{- format_tool_response_block(ns_tname.name, tool_body) -}} {%- endif -%} {%- set ns_tr_out.flag = true -%} {%- set ns.prev_message_type = 'tool_response' -%} {%- endif -%} {%- endfor -%} {%- endif -%} {%- if message['content'] is string -%} {%- if role == 'model' -%} {{- strip_thinking(message['content']) -}} {%- else -%} {{- message['content'] | trim -}} {%- endif -%} {%- elif message['content'] is sequence -%} {%- for item in message['content'] -%} {%- if item['type'] == 'text' -%} {%- if role == 'model' -%} {{- strip_thinking(item['text']) -}} {%- else -%} {{- item['text'] | trim -}} {%- endif -%} {%- elif item['type'] == 'image' -%} {{- '<|image|>' -}} {%- set ns.prev_message_type = 'image' -%} {%- elif item['type'] == 'audio' -%} {{- '<|audio|>' -}} {%- set ns.prev_message_type = 'audio' -%} {%- elif item['type'] == 'video' -%} {{- '<|video|>' -}} {%- set ns.prev_message_type = 'video' -%} {%- endif -%} {%- endfor -%} {%- endif -%} {%- if ns.prev_message_type == 'tool_call' and not ns_tr_out.flag -%} {{- '<|tool_response>' -}} {%- elif not (ns_tr_out.flag and not message.get('content')) -%} {{- '<turn|>\n' -}} {%- endif -%} {%- endif -%} {%- endfor -%} {%- if add_generation_prompt -%} {%- if ns.prev_message_type != 'tool_response' and ns.prev_message_type != 'tool_call' -%} {{- '<|turn>model\n' -}} {%- endif -%} {%- endif -%}
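The template's `strip_thinking` macro removes `<|channel>…<channel|>` thought spans from assistant text before the history is re-serialized. Its logic, transcribed into Python for readability (a sketch mirroring the macro above, not harness code):

```python
def strip_thinking(text: str) -> str:
    """Mirror of the template's strip_thinking macro: split on the closing
    '<channel|>' marker and, in any part containing the opening '<|channel>'
    marker, keep only the text that precedes it."""
    result = ""
    for part in text.split("<channel|>"):
        if "<|channel>" in part:
            result += part.split("<|channel>")[0]
        else:
            result += part
    return result.strip()

print(strip_thinking("A<|channel>thought\nhidden<channel|>B"))  # AB
```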