
ruTXTAquaBench

Type of task: Reasoning
Output format: Choosing an answer
Metric: F1, Exact Match
Domains: Agricultural industry
Statistics: dev: 110, test: 992

ruTXTAquaBench is a dataset designed to assess the professional knowledge of aquaculture that a model acquired during pre-training.

Aquaculture is an important part of industrial agriculture focused on breeding aquatic organisms (fish, crustaceans, mollusks, algae). Aquaculture enterprises produce a valuable source of protein and help to preserve endangered species, such as sturgeon and salmon, by releasing fry into water bodies. Developing aquaculture is strategically important for national food security and for cultivating aquatic species that cannot be harvested in the wild.

The dataset was created in Russian and is entirely original. It contains 1102 multiple-choice questions. Each question has four to eight options, and one or more answers are correct. The topics cover several areas, such as industrial aquaculture, feeding of fish and other aquatic organisms, mariculture (e.g., crayfish and shrimp breeding, pearl cultivation), as well as ichthyopathology (veterinary science, prevention and optimization of fish cultivation technologies).

Keywords: Agriculture, Agricultural Industry, Fishery, Industrial Aquaculture, Feeding of Fish and Other Aquatic Organisms, Mariculture, Crayfish and Shrimp Farming, Artificial Pearl Cultivation, Ichthyopathology.

Authors: Kuban State Agrarian University

Motivation

This task is one of eight benchmarks in the agriculture set and is intended to assess professional knowledge in the field of aquaculture. It resembles the well-known MMLU test in structure and purpose and is suitable for comprehensive testing of how well language models understand and answer professional questions. We provide a public, MMLU-style version of AquaBench in Russian to assess the capabilities of our model on real professional tasks.

Data description

Data fields

  • instruction — a string containing the instruction for the task;
  • inputs — a dict with the input data:
    • question — a string with the task question;
    • option_a — answer option A;
    • option_b — answer option B;
    • option_c — answer option C;
    • option_d — answer option D;
    • option_e — answer option E;
    • option_f — answer option F;
    • option_g — answer option G;
    • option_h — answer option H;
  • outputs — a string containing the correct answer for the task: one or more letters (A-H), separated by commas and written in alphabetical order;
  • meta — a dict with task meta information:
    • id — an integer, the task's unique number in dataset;
    • domain — a string with the task's domain name.
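
The field layout above can be checked with a small validator. The following is a minimal sketch, not part of the dataset tooling: the function name validate_record and the regular expression are hypothetical, and the code assumes that a record is a plain Python dict. It does not check the option fields, because the description does not say how unused options (E-H) are represented.

```python
import re

# Assumed answer format: letters A-H separated by commas, e.g. "A, C".
ANSWER_RE = re.compile(r"^[A-H](, ?[A-H])*$")


def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in one record (empty list = looks fine)."""
    problems = [
        f"missing field: {key}"
        for key in ("instruction", "inputs", "outputs", "meta")
        if key not in record
    ]
    if problems:
        return problems

    inputs, outputs, meta = record["inputs"], record["outputs"], record["meta"]

    if not isinstance(inputs, dict) or not isinstance(inputs.get("question"), str):
        problems.append("inputs.question must be a string")

    if not isinstance(outputs, str) or not ANSWER_RE.match(outputs):
        problems.append("outputs must be letters A-H separated by commas")
    else:
        letters = [part.strip() for part in outputs.split(",")]
        if letters != sorted(set(letters)):
            problems.append("outputs letters must be unique and in alphabetical order")

    if not isinstance(meta, dict) or not isinstance(meta.get("id"), int):
        problems.append("meta.id must be an integer")
    if not isinstance(meta, dict) or not isinstance(meta.get("domain"), str):
        problems.append("meta.domain must be a string")

    return problems
```

An empty result means the record matches the described layout; each string in a non-empty result names one mismatch.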

Prompts

Ten prompts of varying complexity were prepared for the dataset.

Example:

"Select the correct answer options on the topic “{domain}” for the question:\n{question}\n\nA. {option_a}\nB. {option_b}\nC. {option_c}\nD. {option_d}\n\nAnswer: letters only. Multiple answers should be listed in alphabetical order, separated by commas and spaces (“A, B, C”)."
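
Filling such a template could look like the sketch below. It assumes that the placeholders are substituted with Python's str.format, that {domain} comes from meta.domain, and that the remaining placeholders come from inputs; the helper name build_prompt and the record values are hypothetical placeholders, not dataset content.

```python
INSTRUCTION = (
    "Select the correct answer options on the topic “{domain}” for the question:\n"
    "{question}\n\nA. {option_a}\nB. {option_b}\nC. {option_c}\nD. {option_d}\n\n"
    "Answer: letters only. Multiple answers should be listed in alphabetical order, "
    "separated by commas and spaces (“A, B, C”)."
)


def build_prompt(record: dict) -> str:
    """Substitute the record's fields into its instruction template."""
    # Keys of `inputs` that the template does not mention are simply ignored.
    return record["instruction"].format(
        domain=record["meta"]["domain"], **record["inputs"]
    )


# Placeholder record for illustration only (not taken from the dataset).
record = {
    "instruction": INSTRUCTION,
    "inputs": {
        "question": "<question text>",
        "option_a": "<option A>",
        "option_b": "<option B>",
        "option_c": "<option C>",
        "option_d": "<option D>",
    },
    "outputs": "A, C",
    "meta": {"id": 0, "domain": "<domain name>"},
}

prompt = build_prompt(record)
```

Prompts that list more than four options would need the corresponding {option_e} to {option_h} placeholders in the template.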

Dataset Creation

All tasks in this set were written by leading aquaculture specialists, professionally edited, and then manually double-checked by three different experts.

Metric

Quality metrics: Exact Match and F1.
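
The description does not specify how the two metrics are computed for multi-letter answers. One plausible reading is sketched below, purely as an assumption: Exact Match compares the whitespace-stripped answer strings, and F1 is computed per question over the sets of selected letters. The function names are hypothetical, and the official scoring may differ.

```python
def parse_letters(answer: str) -> set[str]:
    """Split an answer such as "A, C" into the set {"A", "C"}."""
    return {part.strip().upper() for part in answer.split(",") if part.strip()}


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the stripped strings are identical, otherwise 0.0 (assumed definition)."""
    return float(prediction.strip() == reference.strip())


def f1_score(prediction: str, reference: str) -> float:
    """Set-based F1 over answer letters (assumed definition)."""
    predicted, expected = parse_letters(prediction), parse_letters(reference)
    true_positives = len(predicted & expected)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(expected)
    return 2 * precision * recall / (precision + recall)
```

Under this reading, a partially correct answer such as "A, B" against the reference "A, C" scores 0 on Exact Match but receives partial credit on F1.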

An example from the dataset