GeneralReasoner

OpenReward Environment

Description

GeneralReasoner is an environment for evaluating general reasoning capabilities using the WebInstruct-verified dataset from the General-Reasoner project by TIGER-AI-Lab. It provides diverse reasoning tasks spanning multiple categories and difficulty levels, with LLM-based semantic grading for flexible answer evaluation.

Capabilities

  • General reasoning evaluation across multiple domains
  • Multi-category question answering
  • Semantic answer verification
  • Varied difficulty levels

Compute Requirements

Agents are given a standard environment with no sandbox or file system access.

Tasks

There are two splits in this environment:

  • train: 228,736 tasks
  • test: 1,000 tasks

Tasks span multiple categories with varying difficulty levels.

Reward Structure

This is a single-turn environment. The agent submits an answer via the answer tool. An LLM grader (gpt-5-mini) evaluates semantic correctness against the reference answer. Reward is binary: 1.0 if correct, 0.0 if incorrect.
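
The reward rule above can be sketched as follows. This is a minimal illustration, not the environment's actual grading code: binary_reward, Judge, and exact_match_judge are hypothetical names, and the stand-in judge uses normalized string comparison where the real environment calls an LLM grader.

```python
from typing import Callable

# Hypothetical signature: a judge receives (question, reference, submission)
# and returns True when the submission is judged correct.
Judge = Callable[[str, str, str], bool]


def binary_reward(question: str, reference: str, submission: str, judge: Judge) -> float:
    """Map a judge verdict to the binary reward: 1.0 if correct, 0.0 otherwise."""
    return 1.0 if judge(question, reference, submission) else 0.0


def _normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def exact_match_judge(question: str, reference: str, submission: str) -> bool:
    """Stand-in for the LLM grader (assumption): normalized exact match only."""
    return _normalize(reference) == _normalize(submission)
```

Keeping the judge as a parameter means the stand-in can be swapped for an LLM-backed callable without changing how the verdict maps to a reward.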

Data

Data consists of Parquet files sourced from the WebInstruct-verified dataset. Each row contains a question, answer, answer type, category, and difficulty level. Data is stored on the OpenReward platform.
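
As an illustration of the row layout described above, a single task could be modeled like this. The field names and the string types are assumptions made for the sketch; the actual Parquet column names and dtypes are not stated here and may differ.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class TaskRow:
    """One dataset row. Field names are hypothetical, not confirmed column names."""

    question: str
    answer: str
    answer_type: str
    category: str
    difficulty: str

    @classmethod
    def from_mapping(cls, row: Mapping[str, Any]) -> "TaskRow":
        """Build a TaskRow from a dict-like row, failing loudly on missing fields."""
        names = [f.name for f in fields(cls)]
        missing = [name for name in names if name not in row]
        if missing:
            raise KeyError(f"row is missing fields: {missing}")
        # Extra keys in the row are ignored; values are coerced to str.
        return cls(**{name: str(row[name]) for name in names})
```

A row loaded from Parquet as a dictionary can be passed to TaskRow.from_mapping, which ignores any extra columns.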

Tools

Tool      Description
answer    Submit your answer for LLM grading. Ends the episode.

Time Horizon

Single-turn. The agent reads the question and submits one answer.

Environment Difficulty

[Put environment difficulty statistics here]

Other Environment Requirements

An OpenAI API key is required for LLM-based grading. Pass it via secrets={"openai_api_key": "..."}.
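
One way to build that secrets dictionary without hard-coding the key is to read it from an environment variable. This is only a sketch: build_secrets is a hypothetical helper, and the OPENAI_API_KEY variable name is an assumption. How the resulting dictionary is handed to the environment depends on the client in use and is not shown.

```python
import os


def build_secrets(env_var: str = "OPENAI_API_KEY") -> dict[str, str]:
    """Return {"openai_api_key": ...} using the value of an environment variable.

    The variable name is an assumption; only the "openai_api_key" dictionary
    key comes from the environment's documented requirement.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"environment variable {env_var} is not set or is empty")
    return {"openai_api_key": key}
```

Failing early when the variable is missing avoids starting an episode that cannot be graded.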

Safety

Agents in GeneralReasoner answer reasoning questions in a standard environment. The environment does not present direct safety risks.

Citation

@inproceedings{ma2025generalreasoner,
  title={General-Reasoner: Advancing {LLM} Reasoning Across All Domains},
  author={Ma, Xueguang and Liu, Qian and Jiang, Dongfu and Zhang, Ge and Ma, Zejun and Chen, Wenhu},
  booktitle={Proceedings of the Neural Information Processing Systems (NeurIPS)},
  year={2025}
}