Giter Site home page Giter Site logo

data61 / psl Goto Github PK

View Code? Open in Web Editor NEW
62.0 9.0 9.0 176.2 MB

License: Other

Isabelle 68.19% TeX 0.57% Standard ML 26.65% Shell 1.07% OCaml 3.52%
proof psl isabelle proof-strategies conjecture hol machine-learning artificial-intelligence itp interactive-theorem-proving

psl's Introduction

News

  • We updated this repository to Isabelle2023.
  • NEW: The introduction of Abduction Prover. You can watch a demo of Abduction Prover in our YouTube channel.
  • LiFtEr and Smart_Induct are no-longer supported, since their successors, SeLFiE and sem_ind, have shown superior performance.
  • PaMpeR is currently not supported either, since we want to minimise the cost necessary to maintain this repository.
  • This is the development version of PSL, SeLFiE, and sem_ind where we try out possibly immature ideas. In case you find problems, please send your feedback.
  • The main developer of this repository, Yutaka, has taken a full-time position at a private company, and he works on PSL and all that only in his spare time. Therefore, the progress of this project, unfortunately, will be slow for the foreseeable future. In case you find problems and requests about data61/PSL, contact Yutaka (email: united.reasoning+gmail.com (reaplace + with @), twitter: YutakangE) or open an issue.

Smart_Isabelle

This repository contains various tools to support interactive theorem proving in Isabelle/HOL using artificial intelligence. This repository contains the implementation of proof strategy language (PSL) and its default strategy, try_hard, for Isabelle2023. Past versions of Isabelle, such as Isabelle2022-1, are no longer supported.

YouTube

We opened a YouTube channel to introduce aspects of this project.

Video Thumbnail

Installation (of SeLFiE, PSL, and sem_ind in one go) (for MacOS/Lunux users)

  1. Install Isabelle2023.
  2. Download or clone this repository (git clone https://github.com/data61/PSL.git).
  3. Open Isabelle/jEdit with PSL and all that. You can do this by opening Isabelle/jEdit as following:
    • (path to the Isabelle binary)isabelle jedit -d (path to the directory that contains this README file) -l Smart_Isabelle
    • If you are a MacOS user and your current directory is this one with this README.md, probably you should type something like this in Terminal:
    • /Applications/Isabelle2023.app/bin/isabelle jedit -d . -l Smart_Isabelle
  4. Then, You can use SeLFiE/PSL/sem_ind to your theory files with the Isabelle keyword, imports as imports "Smart_Isabelle.Smart_Isabelle".
  5. Open Example/Example.thy to see if the installation is successful.

Note on installation for Windows users

The basic steps are the same as MacOS and Linux. However, instead of using the binary file directly, use Isabelle2023\Cygwin-Terminal in Command Prompt. Once you start Isabelle2023\Cygwin-Terminal, you can install our tools by typing isabelle jedit -d (path to the directory that contains this README file) -l Smart_Isabelle. Note that once you started Isabelle2023\Cygwin-Terminal, you should not specify the path to the Isabelle binary file. Therefore, the command you need after starting Isabelle2023\Cygwin-Terminal is something like isabelle jedit -d . -l Smart_Isabelle, assuming that your current directory is this one with this README.md/

Screenshot

If you find it difficult to install our tool, please refer to the Isabelle System Manual. Alternatively, you can just send an email to Yutaka at united.reasoning+gmail.com (reaplace + with @).

Hints

PSL's runtime tactic generation can result in a large number of messages in Isabelle/jEdit's output panel. This might cause Isabelle/jEdit to pause PSL's proof search after reaching its default upper limit for tracing messages.

  • One can circumvent this situation by changing the upper limit to an extreamly large number, say 99999999.
  • One can change the upper limit for tracing messages via jEdit's menus: Plugins => Plugin Options => Isabelle => General => Editor Tracing Messages. Screenshot

Documentations

We published academic papers describing the ideas implemented in this project.

  • A Proof Strategy Language and Proof Script Generation for Isabelle/HOL at CADE2017 explains the overall idea of PSL. (arXiv/Springer)
  • Goal-Oriented Conjecturing for Isabelle/HOL at CICM2018 explains the conjecturing framework implemented as Generalize and Conjecture in PSL/PGT. (arXiv/Springer)
  • PaMpeR: Proof Method Recommendation System for Isabelle/HOL at ASE2018 explains the proof method recommendation system implemented in PSL/PaMpeR. (arXiv/ACM) Note that PaMpeR is currently not supported to minimise the cost to maintain this repository.
  • LiFtEr: Language to Encode Induction Heuristics for Isabelle/HOL at APLAS2019 explains our domain specific language to encode induction heuristics. (arXiv/Springer)
  • smart_induct: Smart Induction for Isabelle/HOL (Tool Paper) accepted at FMCAD2020. (TU Wien Academic Press/Zenodo/YouTube.)
  • Simple Dataset for Proof Method Recommendation in Isabelle/HOL (Dataset Description) at CICM2020. (arXiv/Springer)
  • sem_ind: Faster Smarter Proof by Induction in Isabelle/HOL at IJCAI2021 explains how sem_ind predicts how to apply proof by induction. (IJCAI/YouTube)
  • SeLFiE: Definitional Quantifiers Realise Semantic Reasoning for Proof by Induction at TAP2022 explains the idea and interpreter of SeLFiE, which we developed to implement sem_ind. (arXiv/Springer)
  • TBC: Template-Based Conjecturing for Automated Induction in Isabelle/HOL at FSEN2023. (arXiv/Springer)

We presented the final goal of this project at AITP2017. Our position paper "Towards Smart Proof Search for Isabelle" is available at arXiv.

We are currently developing MeLoId, an extension to PSL to improve its proof automation. Our position paper "Towards Machine Learning Induction" is available at arXiv. We presented this abstract at AITP2019, LiVe2019, and ML2019.

We also plan to improve the proof automation using evolutionary computation. We presented our plan during the poster session at GECCO2019. Our poster-only paper is available at ACM digital library and arXiv.

Preferred Citation

  • PSL: Nagashima, Y., Kumar, R. (2017). A Proof Strategy Language and Proof Script Generation for Isabelle/HOL. In: de Moura, L. (eds) Automated Deduction – CADE 26. CADE 2017. Lecture Notes in Computer Science(), vol 10395. Springer, Cham. https://doi.org/10.1007/978-3-319-63046-5_32

  • PGT: Nagashima, Y., Parsert, J. (2018). Goal-Oriented Conjecturing for Isabelle/HOL. In: Rabe, F., Farmer, W., Passmore, G., Youssef, A. (eds) Intelligent Computer Mathematics. CICM 2018. Lecture Notes in Computer Science(), vol 11006. Springer, Cham. https://doi.org/10.1007/978-3-319-96812-4_19

  • PaMpeR: Yutaka Nagashima and Yilun He. 2018. PaMpeR: proof method recommendation system for Isabelle/HOL. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering (ASE 2018). Association for Computing Machinery, New York, NY, USA, 362–372. DOI:https://doi.org/10.1145/3238147.3238210

  • Towards Evolutionary Theorem Proving for Isabelle/HOL: Yutaka Nagashima. 2019. Towards evolutionary theorem proving for Isabelle/HOL. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO ’19). Association for Computing Machinery, New York, NY, USA, 419–420. DOI:https://doi.org/10.1145/3319619.3321921

  • LiFtEr: Nagashima, Y. (2019). LiFtEr: Language to Encode Induction Heuristics for Isabelle/HOL. In: Lin, A. (eds) Programming Languages and Systems. APLAS 2019. Lecture Notes in Computer Science(), vol 11893. Springer, Cham. https://doi.org/10.1007/978-3-030-34175-6_14

  • Simple Dataset Nagashima Y. (2020) Simple Dataset for Proof Method Recommendation in Isabelle/HOL. In: Benzmüller C., Miller B. (eds) Intelligent Computer Mathematics. CICM 2020. Lecture Notes in Computer Science, vol 12236. Springer, Cham. https://doi.org/10.1007/978-3-030-53518-6_21

  • Smart Induction Yutaka Nagashima. Smart Induction for Isabelle/HOL (Tool Paper). In: Ivrii A., Strichman O. (eds) Proceedings of the 20th Conference on Formal Methods in Computer-Aided Design – FMCAD 2020 DOI:https://doi.org/10.34727/2020/isbn.978-3-85448-042-6_32

  • sem_ind Yutaka Nagashima. Faster Smarter Proof by Induction in Isabelle/HOL. Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence Main Track. Pages 1981-1988 DOI:https://doi.org/10.24963/ijcai.2021/273

  • Definitional Quantifier and SeLFiE Nagashima, Y. (2022). Definitional Quantifiers Realise Semantic Reasoning for Proof by Induction. In: Kovács, L., Meinke, K. (eds) Tests and Proofs. TAP 2022. Lecture Notes in Computer Science, vol 13361. Springer, Cham. https://doi.org/10.1007/978-3-031-09827-7_4

  • Template-Based Conjecturing Nagashima, Y., Xu, Z., Wang, N., Goc, D.S., Bang, J. (2023). Template-Based Conjecturing for Automated Induction in Isabelle/HOL. In: Hojjat, H., Ábrahám, E. (eds) Fundamentals of Software Engineering. FSEN 2023. Lecture Notes in Computer Science, vol 14155 . Springer, Cham. https://doi.org/10.1007/978-3-031-42441-0_9

Screenshots

PSL example

Screenshot

Abduction Prover example

Screenshot

psl's People

Contributors

coda-coda avatar keisrk avatar yutakang avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

psl's Issues

intersperse incorrect :)

fun intersperse _ [] = []

Should be something like:

(* Exercise 2.8 *)
fun intersperse  :: "'a ⇒ 'a list ⇒ 'a list" where
  "intersperse  a [] = []"|
  "intersperse  a [x1] = [x1]"|
  "intersperse  a (x1 # (x2 # xs)) = x1 # (a # (intersperse  a (x2 # xs)))"

PaMpeR: efficient recommendation lookup

When looking up a recommendation in regression trees, we do not have to apply assertions that are not used in any regression tree. Identify them, and skip them when providing recommendations to users.

PaMpeR: More assertions.

The following proof methods appeared in the ITP2018 evaluation results, but PaMpeR failed to produce valuable recommendations. However, I think I can write good assertions for them.

  • algebra -> used at least 43 times in the AFP
  • abstraction -> not used in the AFP? (e1ff4ad)
  • noiminal_induct (10d3461)
  • cut_tac
  • presburger -> variables of type Nat?
  • sos -> used more than 50 time in the AFP (cfd9584)
  • ind_cases -> used about 38 times in the AFP -> (8188c6c)
  • vector -> used more than 30 times. (97b228b)
  • atomize_elim -> (552fead) Maybe checking the existence of meta universal quantifier in the premise of meta implication would be good as well.
  • hypsubst_thin
  • satx
  • vcg (7401fb1)
  • approximation -> used about 5 times in the AFP (cfd9584)
  • linarith -> about 550 times in the AFP. (7c2650e)

The following proof methods appear in Isabelle's standard library. Maybe I can do something about them.

  • countable_datatype -> used twice in the AFP (a28ddc6).
  • corec_unique
  • cooper -> not used in the AFP
  • descending -> does not appear in AFP often. (439324f)
  • disentangle -> does not appear in AFP.
  • dlo -> not used in the AFP.
  • enabled -> not used in the AFP
  • ensures_tac -> not used in the AFP
  • fixrec_simp -> used about 102 times in the AFP (1c2c34c)
  • ferrack -> not used in the AFP
  • field -> used about 124 times in the AFP (7401fb1)
  • hoare -> used more than 90 times in the AFP (466cbca)
  • hypsubst -> used about 158 times in the AFP (053ff5f)
  • invariant -> not used in the AFP
  • lifting -> used 6 times in the AFP (439324f)
  • mir -> not used in the AFP. (f20bc6a)
  • mkex_induct -> not used in the AFP. (f367037)
  • ns_induct -> not used in the AFP.
  • oghoare -> at least 13 times in the AFP. (5680e6c)
  • pair -> probably not used in the AFP (dcd7735).
  • pair_induct -> probably not used in the AFP (dcd7735).
  • parts_prepare -> not used in the AFP
  • perm_simp -> used about 150 times in the AFP (This is defined outside of Isabelle's standard library -> out of scope for now)
  • rawsat -> not used in the AFP
  • reify -> probably not used in the AFP, but I found reify_floatariths
  • reflection -> probably not used in the AFP
  • rename_client_map
  • rewrite -> used at least 102 times in the AFP
  • ring -> used in the AFP
  • rferrack -> not used in the AFP
  • sat -> used at least 49 times in the AFP
  • Seq_case -> not used in the AFP (dcd7735)
  • Seq_case_simp -> not used in the AFP (dcd7735)
  • Seq_Finite_induct -> not used in the AFP (dcd7735)
  • Seq_induct -> not used in the AFP (dcd7735)
  • size_change -> used about three times in the AFP
  • split_idle
  • vcg_simp -> used in the AFP (7401fb1)
  • valid_certificate_tac
  • measurable (3266396)

MiLkMaId: overwrite apply, by, and proof command

These overwritten commands should

  • Ignore combined proof methods.
  • Reuse some programs from PaMpeR.
  • Be combined Dynamic (Induct) in PSL.
  • List all observationally equivalent combinations of list of results of assertions.
    • The data format should be something along the line of
    • ( line number, proof method, [ [ bool ] ] )
  • Store each data point in a separate file to handle concurrent writes.
    • I can use lock.

MiLkMaId: `P (Q x y) (Q x y) = Q (P x y) (P x y)`

Suppose we have the following proof obligation: P (Q x y) (Q x y) = Q (P x y) (P x y) and

  • P is defined in recursion in the first parameter and
  • Q is defined in recursion in the second parameter.

on which variable/term should one conduct mathematical induction?

  • x in P x y
  • y in Q x y
  • Q x y in P (Q x y) (Q x y)
  • P x y in Q (P x y) (P x y)

MiLkMaId: location of assertion methods.

Since MiLkMaId assertions have to be able to access values of the type modifier within Smart_Induct, one cannot define those assertions in modules before Smart_Induct. Since we want to call these assertion methods from Smart_Induct, these assertions have to be defined before or within Smart_Induct.

This means Smart_Induct is the only place one can define these assertions. However, I would like to keep Smart_Induct as close as Dynamic_Induct.

Having a submodule within Smart_Induct would be a decent compromise?

PaMpeR: better data-structure to store assertions.

Currently, all assertions are stored in a list, and they are indexed according to the number in the list. This makes it very hard to remove or add new assertions without affecting other parts because assertions' numbering is used in the assert_nth_true and assert_nth_false commands.

PaMpeR: more stable database extraction

Two possible approaches:

  1. create a database for each assertion.
    • pros: we can find out which assertion causes problems.
    • pros: Isabelle is less likely to run out of memory since for each build we use only one assertion.
    • cons: we have to repeat the normal operations of each proof method many times (= the number of assertions.)
    • cons: Building the entire AFP took about 24 hours on 4-8 CPU machines (100 - 200 CPU hours). So, if we have 3000 assertions it becomes 300,000 - 600,000 hours = 12500 CPU days.
      • Maybe not that bad. Assertions about the constants themselves are pretty inexpensive. We can run 2000 of them at once and be more careful about other assertions that involve context-lookup.
    • cons: probably automatically generated assertions are similar to each other when it comes to the chance of making Isabelle crash. (maybe not. The reason Isabelle crashes is )
    • cons: preprocessing becomes trickier.
  2. create a database for each theory file/session.
    • pros: we do not have to repeat the same computation many times.
    • cons: it's hard to find out which assertion makes Isabelle stops, when one of many assertions causes problems.
    • cons: probably Isabelle is more likely to run out of memory if we use 3000 assertions in each apply/by command.

PaMpeR: distinguish constant names and type names when applying assertions.

Currently, we have many assertions that

  • first extracts term names and type names that appear in the proof obligation at hand
  • check if the underlying proof context has theorems of names that have these type names or constant names as substring with certain strings that indicate the type of the theorem.

For example, has_thms_with_suffixes in Assert_Util does this.
However, we probably should distinguish constant names and type names more explicitly.

PaMpeR: Pruning of regression trees.

In the evaluation results for ITP2018, it was clear that PaMpeR's recommendations for coinduct and coinduction was suffering form over-fitting.

We can fix this by pruning based on cross-validation.

MiLkMaId: apply assertions to all combinations of inductions generated by PSL

In stead of constructing database using hand-written proof scripts only, we should check all the inductions produced by PSL and consider those inductions observationally equivalent to the human written induction as the right choices.

This is an enhancement. We should work on this issue only after we manage to construct a database from the human written proof scripts.

-> No. Actually, it is easier if we work on the PSL generated induction methods directly because we can handle arguments using the data-type (modifier) defined by me.

So, assertions should take the triple of (pre_thm, modifier, post_thm).

And probably it should be implemented in Smart_Induct.ML.

un-de-bruijn-indexed un-curried term

To improve Dynamic (Induct_tac), we should be able to produce the string representation of bound variables Bound of int.

We need a function trm_to_udbi_trm that maps a term to a term that has the type of UDBI_Bound of (string * int). udbi and UDBI stand for "Un-De-Bruijn Index", in which de-Bruijn indices are tagged with the variable names representing those bound variables corresponding to each de-Bruijn index.

We also need a function that maps udbi_trm to uc_udbi_trm, where uc_udbi_trm stands "Un-Curried Un-De-Bruijn-Indexed term".

So, we need

  • val trm_to_udbi_trm: term -> udbi_trm, and

  • val udbi_trm_to_uc_udbi_trm: udbi_trm -> uc_udbi_trm, where

  • type udbi_trm = UDBI_Bound of (string * int) (and similar to term for other cases), and

  • type uc_udbi_trm = UC_UDBI_App of (uc_udbi_trm * uc_udbi_trm list) (and similar to udbi_trm for other cases).

This becomes useful only when we want to predict arguments of the induct_tac method. As long as we work on the induct method, this is not a serious issue.

Printing sub-terms correctly is not a trivial task. See this discussion, for example. For the moment, I give up producing a string that represents a sub-term that includes bound variables that are bound outside this sub-term.

PSL: re-design the syntax

re-design the syntax of PSL: users should be able to specify how tactics are generated during proof search in more detail.

UR: assertion to guess the appropriate number of `back` after `rule meta_allI`

Probably, we can guess how many times we should use the back command after apply (rule meta_allI) by counting the numbers of constants appearing in the first sub-goals before and after back.

Note that after using the back command appropriate time, the number of constants in the first sub-goal should be larger than the number of constants in the first-subgoal before applying the back command by one.

PaMpeR: re-implement the proof_advice command

Given a proof obligation (PO) and its surrounding proof context and state, the proof_advice command

  1. convert the PO and its surrounding context and state into an array of boolean.
  2. investigate how likely an ideal proof engineer uses each method for proof obligations abstracted into the same array, by looking up the corresponding branch in the decision tree for each method.
  3. pick up the 5 most promising methods.

MiLkMaId: names instead of de-Bruijn index.

Having constant names like, "point = {cname = "0", level = 3, utyp = UB}}", does not help machine learning algorithms to recommend how to apply mathematical induction much.

In case, none of the induct, induction, and induct_tac methods can handle mathematical induction on lambda abstracted variables, this question is not very meaningful. However, I guess that induct_tac can handle such cases.

PaMpeR: nan when computing RSS

When no-assertion belongs to a set, the current implementation returns NaN probably because it tries to divide a real number by 0.0.

PaMpeR: use chained facts for more assertions

When PaMpeR looks up relevant derived lemmas in proof contexts, often it only uses constants and types that appear in the first subgoal. But probably PaMpeR should also use constants and types that appear in chained facts.

Probably, this is one reason the tree shape for the transfer method in the ITP2018 version is a bit strange.

PaMpeR: new command "why_method"

When invoked within a proof script with a name of a proof method, why_method should return

  • the expectation of an ideal proof engineer to use that method for that proof obligation in that context, and
  • qualitative explanation as to why PaMpeR provides the above expectation by printing the messages tagged with each assertion used to reach that node in the corresponding decision tree.

PaMpeR: semi-automated feature extraction

Currently, PaMpeR's feature vector is of fixed length (60).

We should produce more assertions automatically.

For example, we have the following three assertions:

  1. the first subgoal has a constant that is defined in List.
  2. the outermost constant of the first subgoal is HOL.All.
  3. the underlying proof context has a pinduct rule related to a constant in the first subgoal.

They should be automatically generated because

  1. we have finite number of constants defined in Isabelle's standard library.
  2. we have finite theory files in Isabelle's standard library.
  3. we have finitely many postfixes in theorems in Isabelle's standard library.

PSL, MiLkMaId, LiFtEr: arbitrary x and y

Currently, PSL does not create a version of induction method that uses the "and" keyword between two "arbitrary" variables.

This limits the power of PSL.

See, for example, this entry to the Isabelle mailing list.

PaMpeR: all assertions about the outermost constant are broken.

The check in this line always returns NONE.

So, assertions always return 0.

A quick (but not so robust) fix would be to use Term.add_const_names instead.

Usually the last or second from the last are what users see as the outermost constants.

If what users see is a constant from meta-logic, we should check the last term name,
if what users see is a constant defined in HOL or user defined constant, we should check the second from the last.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.