gemma-3-4b-it

MERA Created at 22.01.2026 04:48

Ratings for leaderboard tasks

Board   Result  Attempted Score  Coverage  Place in the rating
Multi   0.054   0.161            0.333     44
Images  0.161   0.161            1         31
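
The Multi row is consistent with Result = Attempted Score × Coverage (0.161 × 0.333 ≈ 0.054), and the Images row, with coverage 1, has Result equal to Attempted Score. This relation is inferred from the numbers in the table and is not stated by the source. A minimal Python check under that assumption (the names `rows` and `implied_result` are illustrative):

```python
# Assumption (not stated in the source): Result = Attempted Score * Coverage.
rows = {
    "Multi": {"result": 0.054, "attempted_score": 0.161, "coverage": 0.333},
    "Images": {"result": 0.161, "attempted_score": 0.161, "coverage": 1.0},
}


def implied_result(attempted_score: float, coverage: float) -> float:
    """Return the Result implied by the assumed relation, rounded to 3 decimals."""
    return round(attempted_score * coverage, 3)


for board, row in rows.items():
    implied = implied_result(row["attempted_score"], row["coverage"])
    # Print the board name, the implied value, and whether it matches the table.
    print(board, implied, implied == row["result"])
```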

Tasks

Task                    Modality  Result         Metric
                                  0.172          EM / JudgeScore
                                  0.185          EM / JudgeScore
                                  0.122          EM / JudgeScore
                                  0.027          EM / JudgeScore
                                  0.028          EM / JudgeScore
                                  0.356          EM / JudgeScore
                                  0.082          EM / JudgeScore
                                  0.265          EM / JudgeScore
                                  0.102          EM / JudgeScore
  culture                         0.042 / 0.153
  business                        0.053 / 0.167
  medicine                        0.046 / 0.143
  social_sciences                 0.068 / 0.22
  fundamental_sciences            0.054 / 0.121
  applied_sciences                0.074 / 0.185
                                  0.333          EM / JudgeScore
  biology                         0.326 / 0.415
  chemistry                       0.252 / 0.323
  physics                         0.385 / 0.466
  economics                       0.272 / 0.328
  ru                              0.225 / 0.302
  all                             0.288 / 0.362
                                  0.102          EM / JudgeScore
  biology                         0.035 / 0.07
  chemistry                       0.06 / 0.09
  physics                         0.081 / 0.202
  science                         0.024 / 0.024

Information about the submission

MERA version: v1.0.0
Torch version: 2.8.0
Codebase version: 7e640aa
CUDA version: 12.8
Precision of the model weights: bfloat16
Seed: 1234
Batch: 1
Transformers version: 4.57.1
Number of GPUs and their type: 1 x NVIDIA A100-SXM4-80GB
Architecture: openai-chat-completions

Team: MERA

Name of the ML model: gemma-3-4b-it

Model size: 4.0B

Model type: Open, SFT

Inference parameters

Generation Parameters:
labtabvqa - until=["\n\n"];do_sample=false;temperature=0;
realvqa - until=["\n\n"];do_sample=false;temperature=0;
ruclevr - until=["\n\n"];do_sample=false;temperature=0;
rucommonvqa - until=["\n\n"];do_sample=false;temperature=0;
ruhhh_image - until=["\n\n"];do_sample=false;temperature=0;
runaturalsciencevqa_biology - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=64;
runaturalsciencevqa_chemistry - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=64;
runaturalsciencevqa_earth_science - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=64;
runaturalsciencevqa_physics - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=64;
rumathvqa - until=["\n\n"];do_sample=false;temperature=0;
weird - until=["\n\n"];do_sample=false;temperature=0;
schoolsciencevqa_biology - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
schoolsciencevqa_chemistry - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
schoolsciencevqa_earth_science - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
schoolsciencevqa_economics - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
schoolsciencevqa_history_all - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
schoolsciencevqa_history_ru - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
schoolsciencevqa_physics - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
unisciencevqa_applied_sciences - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
unisciencevqa_business - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
unisciencevqa_cultural_studies - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
unisciencevqa_fundamental_sciences - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
unisciencevqa_health_and_medicine - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
unisciencevqa_social_sciences - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=256;
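
Each generation-parameter entry above has the form `task - key=value;key=value;`. As a rough sketch of how such an entry could be read into a dictionary, the Python below splits on ` - ` and `;` and decodes each value as JSON. The function name `parse_generation_params` is hypothetical, and the sketch assumes that no value contains a `;` and that every value is valid JSON, which holds for the entries listed here but is not guaranteed by the source.

```python
import json


def parse_generation_params(entry: str) -> tuple[str, dict]:
    """Split 'task - key=value;key=value;' into (task, params).

    Hypothetical helper: assumes values are JSON literals without ';'.
    """
    task, _, raw = entry.partition(" - ")
    params = {}
    for item in raw.strip().split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece left by the trailing ';'
        key, _, value = item.partition("=")
        # json.loads turns 'false' into False, '0' into 0,
        # and '["\n\n"]' into a list holding two real newline characters.
        params[key.strip()] = json.loads(value)
    return task.strip(), params


# Raw string so the backslashes reach the parser exactly as written on the page.
entry = r'runaturalsciencevqa_biology - until=["<|endoftext|>"];temperature=0;do_sample=false;max_gen_toks=64;'
task, params = parse_generation_params(entry)
print(task, params)
```

Decoding values as JSON keeps the stop sequences as lists of strings and the flags as booleans, so the result can be compared across tasks without further string handling.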

Context size: 128000