openai / lean-gym Goto Github PK

License: Apache License 2.0

Lean 98.80% Shell 1.20%

lean-gym's Introduction

lean-gym

This repository lets you interact with Lean through a REPL. See Formal Mathematics Statement Curriculum Learning for a presentation of lean-gym.

Setup

# Download pre-built binaries and build the project (targeting mathlib).
bash ./scripts/setup.sh

Usage

lean --run src/repl.lean

Starts a fresh REPL. Once started, the REPL accepts the following commands:

init_search: takes a declaration name as well as a list of open namespaces to initialize a search at the given declaration opening the provided namespaces, and returning the initial tactic state (along with a fresh search_id and tactic_state_id).
run_tac: takes a search_id, a tactic_state_id and a tactic to apply at the tactic state denoted by the provided ids.
clear_search: takes a search_id to clear all state related to a search.

The commands can be interleaved freely enabling the parallelization of multiple proof searches against the same REPL.

$ lean --run src/repl.lean

["init_search", ["int.prime.dvd_mul", ""]]
{"error":null,"proof_steps":[],"search_id":"0","tactic_state":"⊢ ∀ {m n : ℤ} {p : ℕ}, nat.prime p → ↑p ∣ m * n → p ∣ m.nat_abs ∨ p ∣ n.nat_abs","tactic_state_id":"0"}

["run_tac",["0","0","intros"]]
{"error":null,"proof_steps":[],"search_id":"0","tactic_state":"m n : ℤ,\np : ℕ,\nhp : nat.prime p,\nh : ↑p ∣ m * n\n⊢ p ∣ m.nat_abs ∨ p ∣ n.nat_abs","tactic_state_id":"1"}

["run_tac",["0","1","apply (nat.prime.dvd_mul hp).mp"]]
{"error":null,"proof_steps":[],"search_id":"0","tactic_state":"m n : ℤ,\np : ℕ,\nhp : nat.prime p,\nh : ↑p ∣ m * n\n⊢ p ∣ m.nat_abs * n.nat_abs","tactic_state_id":"2"}

["run_tac",["0","2","rw ← int.nat_abs_mul"]]
{"error":null,"proof_steps":[],"search_id":"0","tactic_state":"m n : ℤ,\np : ℕ,\nhp : nat.prime p,\nh : ↑p ∣ m * n\n⊢ p ∣ (m * n).nat_abs","tactic_state_id":"3"}

["run_tac",["0","3","exact int.coe_nat_dvd_left.mp h"]]
{"error":null,"proof_steps":[],"search_id":"0","tactic_state":"no goals","tactic_state_id":"4"}

Declaration names

Declaration names and open namespaces are available in the data/ directory to be used with the init_search command.

Notes

The REPL is subject to crashes in rare cases. Empirically such crash happens no more than ~0.01% of the time.

When a tactic state is reached with no left goals, some custom logic is run to check that the resulting proof's type matches the top level goal type and does not rely on sorry. We also check for the presence of undefined in the proof term. As an example, the following MiniF2F proofs will safely fail with error proof_validation_failed.

["init_search", ["mathd_algebra_35", ""]]
["run_tac", ["0", "0", "intros"]]
["run_tac", ["0", "1", "sorry"]]

["init_search", ["induction_divisibility_3divnto3m2n", ""]]
["run_tac", ["0", "0", "intros"]]
["run_tac", ["0", "1", "rw [add_comm]"]]
["run_tac", ["0", "2", "have h3 : 1 * (n + 1) ≤ (n + 1)"]]
["run_tac", ["0", "3", "rw one_mul"]]
["run_tac", ["0", "4", "apply dvd_trans"]]
["run_tac", ["0", "5", "swap"]]
["run_tac", ["0", "6", "simp []"]]

["init_search", ["mathd_numbertheory_13", ""]]
["run_tac", ["0", "0", "intros u v hu hv hsum"]]
["run_tac", ["0", "1", "intro h"]]
["run_tac", ["0", "2", "contrapose h"]]
["run_tac", ["0", "3", "intro hn"]]
["run_tac", ["0", "4", "exact not_lt_of_lt hn undefined"]]

lean-gym's People

Contributors

Stargazers

Watchers

lean-gym's Issues

fail to setup

leanpkg: command not found when set up.

Proof checks should happen after every tactic

Right now it seems that all proof checks are at the end. This violates the Markovian assumption of Lean gym since a bad tactic upstream can lead to a failure of the final tactic step. Most of the current checks for sorry, undefined, and maybe some of the others are valid checks to do after every proof step. The only one I know for sure won't work is the metavariable check. That isn't supposed to hold until a proof is complete.

(Willing to PR) Support multi sub-goals, like what is done in `HyperTree Proof Search for Neural Theorem Proving`

Hi thanks for the environment! The recent paper https://arxiv.org/abs/2205.11491 let one goal create multiple subgoals, but currently lean-gym will do something like 2 goals\ngoal-one\n\ngoal-two instead of really creating two state-ids.

So, is there a plan to implement it? Or, if I implement it, will you accept the PR?

Segmentation fault error

I am getting segmentation fault error after trying the following:

["init_search",["one_div_add_one_div","set"]]
{"error":null,"proof_steps":[],"search_id":"0","tactic_state":"⊢ ∀ {K : Type u} [_inst_1 : field K] {a b : K}, a ≠ 0 → b ≠ 0 → 1 / a + 1 / b = (a + b) / (a * b)","tactic_state_id":"0"}
["run_tac",["0","0","intros"]]
{"error":null,"proof_steps":[],"search_id":"0","tactic_state":"K : Type u,\n_inst_1 : field K,\na b : K,\nha : a ≠ 0,\nhb : b ≠ 0\n⊢ 1 / a + 1 / b = (a + b) / (a * b)","tactic_state_id":"1"}
["run_tac",["0","1","rw one_div"]]
{"error":null,"proof_steps":[],"search_id":"0","tactic_state":"K : Type u,\n_inst_1 : field K,\na b : K,\nha : a ≠ 0,\nhb : b ≠ 0\n⊢ a⁻¹ + 1 / b = (a + b) / (a * b)","tactic_state_id":"2"}
["run_tac",["0","2","simp [div_eq_mul_inv, inv_eq_one_div]"]]
Segmentation fault (core dumped)

However, on another machine, this was the output (timeout!):

["init_search",["one_div_add_one_div","set"]]
{"error":null,"proof_steps":[],"search_id":"0","tactic_state":"⊢ ∀ {K : Type u} [_inst_1 : field K] {a b : K}, a ≠ 0 → b ≠ 0 → 1 / a + 1 / b = (a + b) / (a * b)","tactic_state_id":"0"}
["run_tac",["0","0","intros"]]
{"error":null,"proof_steps":[],"search_id":"0","tactic_state":"K : Type u,\n_inst_1 : field K,\na b : K,\nha : a ≠ 0,\nhb : b ≠ 0\n⊢ 1 / a + 1 / b = (a + b) / (a * b)","tactic_state_id":"1"}
["run_tac",["0","1","rw one_div"]]
{"error":null,"proof_steps":[],"search_id":"0","tactic_state":"K : Type u,\n_inst_1 : field K,\na b : K,\nha : a ≠ 0,\nhb : b ≠ 0\n⊢ a⁻¹ + 1 / b = (a + b) / (a * b)","tactic_state_id":"2"}
["run_tac",["0","2","simp [div_eq_mul_inv, inv_eq_one_div]"]]
{"error":"gen_tac_and_capture_res_failed: pos=none msg=try_for tactic failed, timeout","proof_steps":[],"search_id":null,"tactic_state":null,"tactic_state_id":null}

Lean versions used are the same in both the machines: Lean (version 3.39.1, commit 1781ded0d006, Release)

Any idea of what I might be missing?

Accessing the Global Environment

Hi,

Thanks for providing a great tool! I wonder whether it is possible to access the global environment (e.g., existing definitions and theorems either in the same file or imported from other files) using lean-gym. Thanks!

Question about PACT tactic dataset

Hi, great work here! But I got some questions when reproducing your paper "Formal Mathematics Statement Curriculum Learning". In this paper, you use the tactic dataset in PACT. However, when I follow the instructions from PACT and create the tactic dataset, most of the tactics and proofs can not reach the "no goals" state when directly applying the tactics in the dataset step by step. Only 30% of them are actually applicable and successfully proven. I wonder if I am doing something wrong here or missing some important steps?

Decreasing Performance

Hi, I have some problems with decreasing performance. Each usage of "run_tac" seems to slow down the usage of further "run_tac"s. At the end is a python file to show the behavior: it repeats (init_search, 2000x run_tac, clear_search) and takes 15s, 17s, 19s, 21s, ...
Is there a way around it or should I just reopen lean-gym periodically?

import time
import subprocess
import os
import json

lean_gym_path = os.path.dirname(os.path.realpath(__file__))
lean_gym = subprocess.Popen(['lean', '--run', 'src/repl.lean'],
                            cwd=lean_gym_path,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)

def run_lean_cmd(cmd: str):
    lean_gym.stdin.write(cmd.encode('utf-8'))
    lean_gym.stdin.flush()
    return json.loads(lean_gym.stdout.readline().decode('utf-8'))

def init_search(decl: str) -> dict:
    cmd = f'["init_search", ["{decl}", ""]]\n'
    return run_lean_cmd(cmd)

def run_tac(search_id: int, tactic_state_id: int, tactic: str) -> dict:
    cmd = f'["run_tac",["{search_id}","{tactic_state_id}","{tactic}"]]\n'
    return run_lean_cmd(cmd)

def clear_search(search_id: int) -> dict:
    cmd = f'["clear_search",["{search_id}"]]\n'
    return run_lean_cmd(cmd)

t1 = time.time()
for i in range(10000):
    t2 = time.time()
    if i > 1:
        print(t2 - t1)
    t1 = t2

    result = init_search('nat.mul_left_eq_self_iff')
    search_id = int(result['search_id'])
    state_id = int(result['tactic_state_id'])

    result = run_tac(search_id, state_id, 'intros')
    search_id = int(result['search_id'])
    state_id = int(result['tactic_state_id'])

    for state_id in range(1000):
        search_id = int(result['search_id'])
        state_id = int(result['tactic_state_id'])
        result = run_tac(search_id, state_id, 'rw mul_comm')

    clear_search(search_id)

How to search a customed theorem with lean-gym?

Hi, I wonder how to search a customed theorem with lean-gym? For example, ∀ (a b : ℝ), ∃ x, (a + b) ^ 3 = a ^ 3 + x * a ^ 2 * b + 3 * a * b ^ 2 + b ^ 3. Thanks!

How to integrate lean-gym with python?

Great job! I want to use lean-gym as an engine to train my AI just like RL gym, but I don't know how to integrate lean-gym with python. Could you please provide an example code? Thanks!

Having trouble init_search miniF2F proofs

I followed the instructions given in the README document, using the theorem int.prime.dvd_mul as a test case, and everything worked as expected. However, when I tried to replace 'int.prime.dvd_mul' with mathd_algebra_141 - a theorem from the miniF2F collection, the system returned the following error message:

"{"error":"not_a_theorem: name=mathd_algebra_141 open_ns=","proof_steps":[],"search_id":null,"tactic_state":null,"tactic_state_id":null}"

As my experience with Lean is rather limited, I'm finding it challenging to pinpoint and resolve the issue. I'd greatly appreciate any assistance or direction you could provide to help me navigate this hurdle.