davinci-002

Created on 26.01.2024 08:32

Overall score: 0.383

Task name      Result                   Metric
BPS            0.521                    Accuracy
LCS            0.124                    Accuracy
RCB            0.331 / 0.178            Avg. F1 / Accuracy
USE            0.016                    Grade Norm
RWSD           0.481                    Accuracy
PARus          0.506                    Accuracy
ruTiE          0.519                    Accuracy
MultiQ         0.119 / 0.044            F1 / EM
ruMMLU         0.613                    Accuracy
CheGeKa        0.018 / 0                F1 / EM
ruModAr        0.476                    Accuracy
SimpleAr       0.927                    Accuracy
ruMultiAr      0.176                    Accuracy
MathLogicQA    0.353                    Accuracy
ruHumanEval    0.005 / 0.023 / 0.037    pass@k
ruWorldTree    0.766 / 0.765            Avg. F1 / Accuracy
ruOpenBookQA   0.675 / 0.676            Avg. F1 / Accuracy
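
The page does not state how the single overall result at the top (0.383) is derived from the per-task results. A minimal sketch, assuming the aggregate is a plain mean over tasks in which a task with several metrics first contributes the mean of its own metrics; this rule and the function name `overall_score` are assumptions for illustration, not something the page confirms:

```python
# Per-task results copied from the table above.
# Tasks that report several metrics list all of them.
results = {
    "BPS": [0.521],
    "LCS": [0.124],
    "RCB": [0.331, 0.178],
    "USE": [0.016],
    "RWSD": [0.481],
    "PARus": [0.506],
    "ruTiE": [0.519],
    "MultiQ": [0.119, 0.044],
    "ruMMLU": [0.613],
    "CheGeKa": [0.018, 0.0],
    "ruModAr": [0.476],
    "SimpleAr": [0.927],
    "ruMultiAr": [0.176],
    "MathLogicQA": [0.353],
    "ruHumanEval": [0.005, 0.023, 0.037],
    "ruWorldTree": [0.766, 0.765],
    "ruOpenBookQA": [0.675, 0.676],
}


def overall_score(task_results):
    """Assumed aggregation: mean over tasks of each task's mean metric."""
    per_task = [sum(values) / len(values) for values in task_results.values()]
    return sum(per_task) / len(per_task)


score = overall_score(results)
print(f"{score:.4f}")
```

With the rounded values shown in the table this gives roughly 0.3835, which is in line with the reported 0.383; the small gap is what one would expect if the leaderboard aggregates unrounded per-task values.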

Evaluation on diagnostic datasets:

These results are not included in the overall score.

Task name      Result   Metric
ruHHH          0.517    Accuracy
  • Honest: 0.525
  • Harmless: 0.466
  • Helpful: 0.559
ruHateSpeech   0.551    Accuracy
  • Women: 0.472
  • Men: 0.657
  • LGBT: 0.588
  • Nationality: 0.541
  • Migrants: 0.571
  • Other: 0.623
ruDetox
  • Overall average score (J): 0.349
  • Preservation of meaning (SIM): 0.676
  • Naturalness (FL): 0.665
  • Style Transfer Accuracy (STA): 0.705

ruEthics
                 Correct   Good     Ethical
Virtue           -0.033    -0.002   -0.006
Law              -0.041    -0.008   -0.041
Moral            -0.029     0.001   -0.024
Justice          -0.046    -0.011    0.012
Utilitarianism   -0.015    -0.028   -0.028

Table results:

[[-0.033, -0.041, -0.029, -0.046, -0.015],
 [-0.002, -0.008, 0.001, -0.011, -0.028],
 [-0.006, -0.041, -0.024, 0.012, -0.028]]

Metric: 5 MCC (Matthews correlation coefficient, one per ethical criterion)
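
The raw matrix above is the transpose of the ruEthics table: each matrix row holds one question type and each column one ethical criterion. A short sketch of that mapping, assuming the row order Correct, Good, Ethical and the column order Virtue, Law, Moral, Justice, Utilitarianism, both read off the table; the variable names are illustrative:

```python
# Labels taken from the ruEthics table; the ordering is inferred by matching values.
questions = ["Correct", "Good", "Ethical"]                           # matrix rows
criteria = ["Virtue", "Law", "Moral", "Justice", "Utilitarianism"]   # matrix columns

matrix = [
    [-0.033, -0.041, -0.029, -0.046, -0.015],
    [-0.002, -0.008, 0.001, -0.011, -0.028],
    [-0.006, -0.041, -0.024, 0.012, -0.028],
]

# Re-key the matrix as {criterion: {question: MCC}} to match the table layout.
table = {
    criterion: {question: matrix[i][j] for i, question in enumerate(questions)}
    for j, criterion in enumerate(criteria)
}

for criterion, row in table.items():
    print(criterion, row)
```

All fifteen coefficients lie within about 0.05 of zero, so under this reading the model's answers are close to uncorrelated with every ethical criterion.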

Information about the submission:

Team:

MERA

Name of the ML model:

davinci-002

Additional links:

https://github.com/openai/openai-python

Architecture description:

GPT base model from OpenAI. Details are not disclosed.

Description of the training:

GPT base model from OpenAI. Details are not disclosed.

Pretrain data:

GPT base model from OpenAI. Details are not disclosed.

Training Details:

GPT base model from OpenAI. Details are not disclosed.

License:

Apache 2.0 license

Strategy, generation and parameters:

Code version v.1.1.0. All parameters were left unchanged and used as prepared by the organizers. Details:
  • OpenAI 1.10.0
  • Tiktoken 0.5.2
  • Context length 2049