llm-srbench

Name: arXiv/llm-srbench
Author: arXiv

Description

LLM-SRBench is a benchmark for evaluating LLM-based scientific equation discovery. It comprises 239 challenging problems across four scientific domains, designed so that they cannot be solved by trivial memorization. The problems fall into two categories: LSR-Transform, which converts common physical models into less common mathematical representations to test reasoning beyond memorized forms, and LSR-Synth, which creates synthetic, data-driven discovery problems. Current methods reach at most 31.5% symbolic accuracy.
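
To make the LSR-Transform idea concrete, here is a minimal sketch of re-expressing a familiar law with a different target variable and generating data for the transformed problem. The mass-spring example, the function names, and the sampling ranges are illustrative assumptions; they are not taken from the benchmark or from its generation pipeline.

```python
import math
import random


def period(m, k):
    """Familiar form: period of a mass-spring oscillator, T = 2*pi*sqrt(m/k)."""
    return 2 * math.pi * math.sqrt(m / k)


def mass_from_period(T, k):
    """Transformed form: the same law solved for m, m = k * (T / (2*pi))**2."""
    return k * (T / (2 * math.pi)) ** 2


def make_transformed_dataset(n, seed=0):
    """Sample (T, k) -> m rows; the hypothetical discovery target is m, not T."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        m = rng.uniform(0.1, 10.0)
        k = rng.uniform(0.1, 10.0)
        T = period(m, k)          # data still comes from the familiar law
        rows.append((T, k, m))    # inputs are (T, k), target is m
    return rows


dataset = make_transformed_dataset(5)
for T, k, m in dataset:
    # The less common representation must reproduce the sampled target.
    assert math.isclose(mass_from_period(T, k), m, rel_tol=1e-9)
```

A model that has only memorized the textbook form for T would have to reason its way to the rearranged expression for m, which is the kind of behavior the category is described as testing.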

Implementations (1)

Environment	Stars	Last Updated
parshinsh/llmsr-bench-full	3	3 months ago