BioReason
Description
BioReason is an environment for evaluating biological reasoning, derived from the BioReason benchmark. Tasks are built on the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database: agents answer questions about the molecular pathways and genetic mechanisms that connect genetic variants to disease phenotypes. An LLM grader evaluates answer correctness.
Capabilities
- Multi-step causal reasoning across molecular networks
- Integration of pathway and variant data
- Precise molecular mechanism identification
- Clinical database integration (ClinVar, dbSNP, OMIM, COSM)
- Standardized molecular network representation
Compute Requirements
Agents are given a standard environment with no sandbox or file system access.
License
Tasks
Tasks span multiple splits covering different aspects of biological reasoning. Each entry contains a question about genetic variants and their mechanistic connections to disease phenotypes.
Reward Structure
This is a sparse reward environment with LLM-based grading:
- Agent receives a biological reasoning question
- Agent submits an answer via the answer tool
- An LLM grader (gpt-4.1) evaluates the answer against the reference
- Binary reward: 1.0 if correct, 0.0 if incorrect
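The reward flow in the list above can be sketched as follows. This is a minimal illustration, not the environment's implementation: `binary_reward` and `exact_match_grader` are hypothetical names, and the exact-match grader is only an offline stand-in for the LLM grader (gpt-4.1), which judges correctness against the reference rather than comparing strings.

```python
from typing import Callable

# A grader takes (question, submitted_answer, reference_answer) and returns
# True if the submission is judged correct. In the real environment this
# judgment comes from an LLM grader; the signature here is an assumption.
Grader = Callable[[str, str, str], bool]


def exact_match_grader(question: str, submitted: str, reference: str) -> bool:
    """Hypothetical offline stand-in: case-insensitive exact match."""
    return submitted.strip().lower() == reference.strip().lower()


def binary_reward(question: str, submitted: str, reference: str, grader: Grader) -> float:
    """Map the grader's verdict to the sparse binary reward: 1.0 or 0.0."""
    return 1.0 if grader(question, submitted, reference) else 0.0
```

Because the reward is binary and issued once per task, there is no partial credit for an answer that names the right pathway but the wrong disease, or vice versa.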
Data
Data is derived from the KEGG pathway database with integration from clinical databases (ClinVar, dbSNP, OMIM, COSM). Task data is stored on the OpenReward platform.
Tools
| Tool | Description |
|---|---|
| answer | Submit answer for LLM-based grading |
Time Horizon
Single-turn. Each task is evaluated in a single interaction.
Environment Difficulty
Model performance on the BioReason KEGG benchmark (290 test samples):
| Model | Accuracy | F1-Score |
|---|---|---|
| Evo2 + Qwen3-4B (+GRPO) | 98.28% | 93.05% |
| Evo2 + Qwen3-4B | 97.24% | 86.30% |
| Qwen3-4B | 93.48% | 85.44% |
Other Environment Requirements
An OpenAI API key is required for LLM-based grading. Pass it via secrets={"openai_api_key": "..."}.
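One way to build the secrets mapping without hard-coding the key is to read it from an environment variable. The sketch below assumes the key is stored in `OPENAI_API_KEY` (a common convention, not something this environment mandates), and `build_secrets` is a hypothetical helper. How the resulting dict is handed to the environment depends on the client you use and is not shown here.

```python
import os
from typing import Mapping, Optional


def build_secrets(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the secrets dict expected for LLM-based grading.

    Reads the key from OPENAI_API_KEY (assumed variable name) and fails
    early with a clear message if it is missing or empty.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; the LLM grader needs it.")
    return {"openai_api_key": key}
```

Failing early here avoids starting a run that can only error out once the first answer reaches the grader.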
Safety
Agents in BioReason answer biological reasoning questions in a standard environment. The environment does not present direct safety risks.
Citation
@misc{fallahpour2025bioreasonincentivizingmultimodalbiological,
title={BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model},
author={Adibvafa Fallahpour and Andrew Magnuson and Purav Gupta and Shihao Ma and Jack Naimer and Arnav Shah and Haonan Duan and Omar Ibrahim and Hani Goodarzi and Chris J. Maddison and Bo Wang},
year={2025},
eprint={2505.23579},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2505.23579}
}