VolleyBench

Description

VolleyBench is an environment for building machine learning models of international volleyball and using those models to bet on historical betting markets. Agents develop ML strategies from historical match data from the Women's Volleyball World Championships, place bets on match outcomes, and manage their bankroll across a 1-year window of tournament play.

Capabilities

Developing machine learning models for volleyball match prediction
Analyzing team performance and historical matchup data
Backtesting models against historical betting odds
Bankroll management and bet execution

Compute Requirements

Agents are given a sandbox with file system access and scientific Python libraries (pandas, numpy).

Tasks

There is one split in this environment:

Train: 3 scenarios

Scenario	Start Date	Starting Bankroll	Training Data
early-volleyball	January 2014	$100	Up to 2014
mid-volleyball	January 2018	$150	Up to 2018
recent-volleyball	January 2022	$200	Up to 2022

Each scenario covers a 1-year window of volleyball tournament play.

Reward Structure

This is a dense, verifiable reward environment. A reward is issued after each matchday. It is calculated as the difference in log wealth before and after that matchday's bets are settled, i.e.:

$\log{W_{t+1}} - \log{W_{t}}$

Agents must place at least one bet per matchday. No LLM graders are used; the reward is determined entirely by match outcomes.
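The reward formula above can be illustrated with a small worked example. This is a minimal sketch, not the environment's own code: the helper names `settle` and `matchday_reward` are hypothetical, and the payout rule assumes decimal odds (a winning bet returns stake times odds), which the description does not state.

```python
import math


def settle(bankroll: float, stake: float, decimal_odds: float, won: bool) -> float:
    """Bankroll after one settled bet.

    Assumption: odds are decimal, so a winning bet returns stake * odds
    and a losing bet returns nothing.
    """
    payout = stake * decimal_odds if won else 0.0
    return bankroll - stake + payout


def matchday_reward(bankroll_before: float, bankroll_after: float) -> float:
    """Log-wealth difference: log(W_{t+1}) - log(W_t)."""
    if bankroll_before <= 0 or bankroll_after <= 0:
        raise ValueError("log wealth is undefined for a non-positive bankroll")
    return math.log(bankroll_after) - math.log(bankroll_before)


w0 = 100.0
w_win = settle(w0, stake=10.0, decimal_odds=2.0, won=True)    # bankroll rises to 110.0
w_loss = settle(w0, stake=10.0, decimal_odds=2.0, won=False)  # bankroll falls to 90.0

print(matchday_reward(w0, w_win))   # positive reward
print(matchday_reward(w0, w_loss))  # negative reward, larger in magnitude
```

Because the reward is a log difference, losing a given stake costs more reward than winning the same amount gains, and a bankroll of zero has no defined reward.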

Data

Historical volleyball match data from Women's Volleyball World Championships including team names, scores, betting odds, and match results. Training data is mounted at /tmp/gr-datasets for agents to build models.
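Since the file layout under /tmp/gr-datasets is not documented here, a reasonable first step inside the sandbox is to inventory what is mounted before loading anything. The sketch below uses only the standard library; the helper name `list_dataset_files` is hypothetical, and no file names or formats are assumed.

```python
from pathlib import Path


def list_dataset_files(root: str = "/tmp/gr-datasets") -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for every file under root.

    Returns an empty list if the directory does not exist.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(
        (str(p.relative_to(base)), p.stat().st_size)
        for p in base.rglob("*")
        if p.is_file()
    )


# Print an inventory of the mounted training data.
for name, size in list_dataset_files():
    print(f"{size:>12}  {name}")
```

Once the actual files and their formats are known, they can be loaded with pandas or another of the libraries available in the sandbox.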

Tools

Agents are given access to CLI tools (bash, read, write, edit, multi_edit, grep, glob, ls, todo_write) plus 4 environment-specific tools:

Tool	Description
`view_matches`	View current matchday's volleyball games with team names and betting odds.
`place_bet`	Place a bet on a match outcome (team1 or team2) with a specified amount.
`view_bankroll`	View current bankroll and active bets.
`next_matchday`	Settle bets, receive reward, and advance to the next matchday.
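The amount passed to `place_bet` is left to the agent. One common sizing rule for a log-wealth objective is the Kelly criterion; VolleyBench does not prescribe it, and it is shown here only as an illustration. The sketch assumes decimal odds and a model-estimated win probability; `kelly_fraction` and `suggested_stake` are hypothetical helper names, not part of the environment.

```python
def kelly_fraction(p_win: float, decimal_odds: float) -> float:
    """Kelly fraction of bankroll for a single binary bet.

    Assumption: decimal odds, so the net profit per unit staked is odds - 1.
    Returns 0.0 when the bet has no positive expected edge.
    """
    if not 0.0 <= p_win <= 1.0:
        raise ValueError("p_win must be a probability in [0, 1]")
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    b = decimal_odds - 1.0                   # net profit per unit staked
    f = (p_win * b - (1.0 - p_win)) / b      # classic Kelly formula
    return max(f, 0.0)


def suggested_stake(bankroll: float, p_win: float, decimal_odds: float,
                    scale: float = 0.5) -> float:
    """Stake as a scaled-down Kelly fraction of the current bankroll.

    scale < 1 ("fractional Kelly") hedges against errors in p_win.
    """
    return bankroll * scale * kelly_fraction(p_win, decimal_odds)


# Model says 60% for team1 at decimal odds 2.0, with a bankroll of 100.
print(kelly_fraction(0.6, 2.0))
print(suggested_stake(100.0, 0.6, 2.0))
```

Because the environment requires at least one bet per matchday, an agent using a rule like this would still need a fallback, such as a small stake on its most favourable match, on days when every fraction is zero.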

Time Horizon

VolleyBench is an open-ended, long-horizon environment where agents simulate a year of model development and betting across international volleyball tournaments.

Environment Difficulty

[Put environment difficulty statistics here]

Other Environment Requirements

There are no further environment requirements; VolleyBench works out of the box with the OpenReward endpoint without any external API keys.

Safety

Agents in VolleyBench are told to maximize their long-run bankroll growth. The environment does not present direct safety risks, as agents only interact with historical data through betting decisions on public odds.

There may be indirect risks, however: an agent trained to maximize long-run wealth may follow this objective blindly when tested in other environments, which could lead it to pursue unethical objectives. We advise that multi-environment training runs involving VolleyBench include other environments that teach agents to respect ethical norms, so that the agent learns a broader category of objectives than just maximizing wealth.

Citation

@dataset{GRVolleyBench,
  author    = {General Reasoning Inc. Team},
  title     = {VolleyBench},
  year      = {2026},
  publisher = {OpenReward},
  url       = {https://www.openreward.ai/GeneralReasoning/VolleyBench}
}

Compute Configuration

Resource allocation for this environment.

Component	Configuration
Environment Server	1 vCPU / 4 GB RAM
Sandbox Machine	1 vCPU / 4 GB RAM

Estimated Cost

Pay per second of active session usage. Billing starts when your session begins and stops when it ends.

Component	Cost / second
Environment	$0.0000320
Sandbox	$0.0000320
Total	$0.0000640

Examples

Session	Estimated cost
5-minute session	$0.0192
1-hour session	$0.2304
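The example figures follow directly from the per-second rates in the cost table. A minimal sketch of the arithmetic, with hypothetical constant and function names:

```python
# Per-second rates taken from the cost table above (USD).
ENVIRONMENT_COST_PER_SECOND = 0.0000320
SANDBOX_COST_PER_SECOND = 0.0000320


def session_cost(seconds: float) -> float:
    """Estimated cost in USD of a session lasting the given number of seconds."""
    return seconds * (ENVIRONMENT_COST_PER_SECOND + SANDBOX_COST_PER_SECOND)


print(f"5-minute session: ${session_cost(5 * 60):.4f}")   # $0.0192
print(f"1-hour session:   ${session_cost(60 * 60):.4f}")  # $0.2304
```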