gemma-3-4b-it

MERA Created at 22.01.2026 04:48

Ratings for leaderboard tasks

Board	Result	Attempted Score	Coverage	Rank
Multi	0.054	0.161	0.333	44
Images	0.161	0.161	1	31
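
The two board rows are consistent with Result being Attempted Score multiplied by Coverage (0.161 × 0.333 ≈ 0.054 for Multi). The page does not state this formula, so the sketch below treats it as an assumption and only checks it against the values shown; the function name `implied_result` is illustrative.

```python
# Assumption (not stated on the page): Result = Attempted Score * Coverage.
boards = {
    "Multi": {"result": 0.054, "attempted": 0.161, "coverage": 0.333},
    "Images": {"result": 0.161, "attempted": 0.161, "coverage": 1.0},
}


def implied_result(attempted: float, coverage: float) -> float:
    """Return the board result implied by the assumed formula, to 3 decimals."""
    return round(attempted * coverage, 3)


for name, row in boards.items():
    implied = implied_result(row["attempted"], row["coverage"])
    # Print the implied value next to the value reported in the table.
    print(f"{name}: implied={implied}, reported={row['result']}")
```

If the assumption holds, the low Multi result reflects the model attempting only about a third of that board rather than a lower score on the tasks it did attempt.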

Tasks

Task	Result	Metric
WEIRD	0.172	EM JudgeScore
RealVQA	0.185	EM JudgeScore
ruCLEVR	0.122	EM JudgeScore
LabTabVQA	0.027	EM JudgeScore
ruMathVQA	0.028	EM JudgeScore
ruCommonVQA	0.356	EM JudgeScore
ruHHH-Image	0.082	EM JudgeScore
ruTiE-Image	0.265	EM JudgeScore
UniScienceVQA	0.102	EM JudgeScore
culture	0.042 / 0.153
business	0.053 / 0.167
medicine	0.046 / 0.143
social_sciences	0.068 / 0.22
fundamental_sciences	0.054 / 0.121
applied_sciences	0.074 / 0.185
SchoolScienceVQA	0.333	EM JudgeScore
biology	0.326 / 0.415
chemistry	0.252 / 0.323
physics	0.385 / 0.466
economics	0.272 / 0.328
ru	0.225 / 0.302
all	0.288 / 0.362
ruNaturalScienceVQA	0.102	EM JudgeScore
biology	0.035 / 0.07
chemistry	0.06 / 0.09
physics	0.081 / 0.202
science	0.024 / 0.024
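
The Images board score of 0.161 matches the unweighted mean of the eleven top-level task results above. The page does not document the aggregation rule, so the following is a quick check under that assumption, not a description of MERA's scoring code.

```python
# Top-level task results copied from the Tasks table.
task_results = {
    "WEIRD": 0.172,
    "RealVQA": 0.185,
    "ruCLEVR": 0.122,
    "LabTabVQA": 0.027,
    "ruMathVQA": 0.028,
    "ruCommonVQA": 0.356,
    "ruHHH-Image": 0.082,
    "ruTiE-Image": 0.265,
    "UniScienceVQA": 0.102,
    "SchoolScienceVQA": 0.333,
    "ruNaturalScienceVQA": 0.102,
}

# Assumption: the board score is the unweighted mean over top-level tasks.
values = list(task_results.values())
mean_score = round(sum(values) / len(values), 3)
print(mean_score)
```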

Information about the submission

MERA version

v1.0.0

Torch Version

2.8.0

The version of the codebase

7e640aa

CUDA version

12.8

Precision of the model weights

bfloat16

Seed

1234

Batch

Transformers version

4.57.1

The number of GPUs and their type

1 x NVIDIA A100-SXM4-80GB

Architecture

openai-chat-completions

Team:

MERA

Name of the ML model:

gemma-3-4b-it

Link to the ML model:

https://huggingface.co/google/gemma-3-4b-it

Model size

4.0B

Model type:

Open

SFT

Inference parameters

Generation Parameters:
labtabvqa - until=["\n\n"];do_sample=false;temperature=0;
realvqa - until=["\n\n"];do_sample=false;temperature=0;
ruclevr - until=["\n\n"];do_sample=false;temperature=0;
rucommonvqa - until=["\n\n"];do_sample=false;temperature=0;
ruhhh_image - until=["\n\n"];do_sample=false;temperature=0;
runaturalsciencevqa_biology - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=64;
runaturalsciencevqa_chemistry - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=64;
runaturalsciencevqa_earth_science - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=64;
runaturalsciencevqa_physics - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=64;
rumathvqa - until=["\n\n"];do_sample=false;temperature=0;
weird - until=["\n\n"];do_sample=false;temperature=0;
schoolsciencevqa_biology - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
schoolsciencevqa_chemistry - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
schoolsciencevqa_earth_science - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
schoolsciencevqa_economics - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
schoolsciencevqa_history_all - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
schoolsciencevqa_history_ru - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
schoolsciencevqa_physics - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
unisciencevqa_applied_sciences - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
unisciencevqa_business - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
unisciencevqa_cultural_studies - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
unisciencevqa_fundamental_sciences - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
unisciencevqa_health_and_medicine - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
unisciencevqa_social_sciences - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
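
Each task's generation parameters are a semicolon-separated list of `key=value` pairs whose values look like JSON literals. A minimal sketch for turning one such string into a dictionary follows; the helper `parse_generation_params` is hypothetical and not part of MERA, and it assumes no stop sequence itself contains a semicolon.

```python
import json


def parse_generation_params(spec: str) -> dict:
    """Parse 'key=value;key=value;' into a dict, decoding JSON-like values."""
    params = {}
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after the trailing semicolon
        key, _, raw = item.partition("=")
        try:
            # Values such as 0, false, 64 and ["..."] are valid JSON literals.
            params[key] = json.loads(raw)
        except json.JSONDecodeError:
            params[key] = raw  # fall back to the raw string
    return params


example = 'until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=64;'
print(parse_generation_params(example))
```

With `temperature=0` and `do_sample=false` on every task, decoding is greedy, so the listed settings differ between tasks only in the stop sequence and, where given, `max_gen_toks`.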

The size of the context:
128000
