jasonrute / lean_proof_recording Goto Github PK

View Code? Open in Web Editor NEW

24.0 24.0 8.0 582 KB

Proof recording for Lean 3

License: Apache License 2.0

Lean 10.86% Python 44.63% Jupyter Notebook 43.86% Shell 0.66%

lean_proof_recording's People

Contributors

Stargazers

Watchers

Forkers

franklynwang dselsam wudaocode xinhjbrant yangky11 yeahrmek fzyzcjy devjuice1

lean_proof_recording's Issues

Lean file permissions have changed

The core .lean files used to be "-rw-r--r--", but I they are now "-r--r--r--" (as per a recently change to elan). So when I copy them over they keep these permissions. I need to change the permissions back, either when copying them over or write before I modify the file. (Note, the mathlib files still have write permissions, but that isn't guaranteed to last, so I should force them to be writable.)

`error: failed to synthesize type class instance for`

Hi thanks for the lib! After upgrading Lean and Mathlib it fails to compile:

[12:10:57]: $ lean --make src/empty.lean (shell=True, cwd=/Users/tom/Main/research/code/third_party/lean_proof_recording, extra_env=None)
[12:11:12]: /Users/tom/Main/research/code/third_party/lean_proof_recording/_target/deps/lean/library/init/meta/interactive.lean:972:20: error: failed to synthesize type class instance for
[12:11:12]: ⊢ (pr.recorded (with_desc ↑"{...}" parser.itactic)?).reflectable
[12:11:12]: /Users/tom/Main/research/code/third_party/lean_proof_recording/_target/deps/lean/library/init/meta/interactive.lean:989:22: error: failed to synthesize type class instance for
[12:11:12]: ⊢ (pr.recorded (with_desc ↑"{...}" parser.itactic)?).reflectable
...

I will PR after learning more about Lean :)

Parsing error using |x| local notation in calc tactic

Currently, we get the following error when trying to process the data for bernstein.lean.

data_mathlib/lean_files/mathlib/src/analysis/special_functions/bernstein.lean
Traceback (most recent call last):
  File "/root/code/formal/formal/lean/lean_proof_recording/lean_proof_recording/pipeline/extract_proof_data.py", line 487, in run
    parser_ast = parser.read_begin()
  File "/root/code/formal/formal/lean/lean_proof_recording/lean_proof_recording/parser.py", line 1108, in read_begin
    tactics = self.read_tactic_list()
  File "/root/code/formal/formal/lean/lean_proof_recording/lean_proof_recording/parser.py", line 1040, in read_tactic_list
    self.raise_error(f'Expected "," or "{right}"')
  File "/root/code/formal/formal/lean/lean_proof_recording/lean_proof_recording/parser.py", line 384, in raise_error
    raise Exception(f"{msg}:\n{self.format_file_location()}")
Exception: Expected "," or "end":
data_mathlib/lean_files/mathlib/src/analysis/special_functions/bernstein.lean:240:5
0239:     |(bernstein_approximation n f - f) x|
          ^

The issue, is two fold:

There is a local notation in the file using |, which is normally used to delimit cases in a match or in an inductively defined term proof.
It is used inside calc which is one of the few cases where we have no information from Lean on how to parse, so I have to just scan along until we reach a reasonable end point. Currently that is set as one of "end", ";", ",", "|", "<|>", a right bracket, or a user command (but taking nesting parentheses and match statements into account).

Hence it is done scanning too early on the calc tactic. Luckily this is caught as an error, since the calc tactic is inside a tactic list so we know that the calc tactic can actually only end with "," or "end" (or ";", or "<|>", if it was inside a semicolon list or alternative inside that outer tactic list.)

I see two solutions:

Scan the calc tactic in stages. Since the calc tactic is of the form calc xxx : xxx ... xxx : xxx ... xxx : xxx (where the xxx are terms), I could first scan until :, and then scan until ... or one of the end delimiters. In this particular case, that should catch the user notation since it is always on the left of the :.
Make the end points for scanning the calc tactic depend on if I am inside a tactic list or a by.

I think I like option (1) better, but let me think about it for a bit first.

Issues with doc strings when parsing itactic arguments

Here is an error message :

data_mathlib/lean_files/mathlib/src/geometry/euclidean/basic.lean
Traceback (most recent call last):
  File "/root/code/formal/formal/lean/lean_proof_recording/lean_proof_recording/pipeline/extract_proof_data.py", line 485, in run
    parser_ast = parser.read_by()
  File "/root/code/formal/formal/lean/lean_proof_recording/lean_proof_recording/parser.py", line 1098, in read_by
    tactic = self.read_maybe_semicolon_tactic()
  File "/root/code/formal/formal/lean/lean_proof_recording/lean_proof_recording/parser.py", line 1049, in read_maybe_semicolon_tactic
    tactic = self.read_maybe_alt_tactic()
  File "/root/code/formal/formal/lean/lean_proof_recording/lean_proof_recording/parser.py", line 999, in read_maybe_alt_tactic
    first_tactic = self.read_single_tactic()
  File "/root/code/formal/formal/lean/lean_proof_recording/lean_proof_recording/parser.py", line 991, in read_single_tactic
    return self.read_named_tactic()
  File "/root/code/formal/formal/lean/lean_proof_recording/lean_proof_recording/parser.py", line 852, in read_named_tactic
    assert (param_end_line, param_end_column) == self.start_pos()
AssertionError

In particular it is an issue with this bit of lean code:

lemma eq_reflection_of_eq_subspace {s s' : affine_subspace ℝ P} [nonempty s]
  [nonempty s'] [complete_space s.direction] [complete_space s'.direction] (h : s = s') (p : P) :
  (reflection s p : P) = (reflection s' p : P) :=
by unfreezingI { subst h }

/-- Reflection is its own inverse. -/
@[simp] lemma reflection_symm (s : affine_subspace ℝ P) [nonempty s] [complete_space s.direction] :
  (reflection s).symm = reflection s :=
rfl

The issue is in two ways to parse the argument of unfreezingI:

Parse the tactic block manually by first matching parentheses and then taking any whitespace and comments.
Use the beginning and end positions of the parameters as provided by Lean.

We do (1) but then check with an assert that it agrees with (2). The reason we do this is that the parsing is quite finicky, so this gives us assurance that we are parsing correctly. Also, there are other times, e.g. conv tactics, where Lean can't give us information about the position of the argument and we just have to parse it using (1). This check makes sure we are using the right method in those cases too.

The problem here is that doc strings shouldn't count toward the parameter (while line comments and normal block comments should).

There are two possible fixes:

Modify the parsing code (around line 851 of parser.py) to not consume all white space. Instead, (similarly to the parsing of other arguments) we should parse itactics as follows:
1. Parse the tactic block (matching parentheses).
2. Test that we are before the end of the Lean-recorded parameter. If not raise an error (not using an assert).
3. Take white space and comment tokens one at a time until we get to the desired end (like the code for parsing other parameters). Raise an error if a non-whitespace token is present.
Remove doc strings from consume_space method. To do that, we would have to change the tokenizer to treat doc strings as different from other block comments. We would also have to probably add doc strings to places in the code where we are looking for a command (e.g. def) since a doc string would mean we are at the end of a proof.

(2) is technically better and ideally it should still be done since it would prevent doc strings from being recorded as part of the tactics, but it is also pedantic and a lot more work. (1) should be sufficient for this problem, especially since all comments including doc strings are stripped out of the training data. I recommend doing (1) which should be fast and easy to implement.

jasonrute / lean_proof_recording Goto Github PK

lean_proof_recording's People

Contributors

Stargazers

Watchers

Forkers

lean_proof_recording's Issues

Lean file permissions have changed

`error: failed to synthesize type class instance for`

Parsing error using |x| local notation in calc tactic

Issues with doc strings when parsing itactic arguments

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent