Digging through the forward_call code, I made two observations: 1.)

With this modification I get 75 copies of: <div class="snippet-clipboard-content n

Now try it with this: <div class="snippet-clipboard-content notranslate position-r

It is a bit tricky for me to test prove_call_setup_aux</code

Some observations on `forward_call` performance,about princetonuniversity/vst

Comments (21)

andrew-appel commented on September 25, 2024

Regarding change_compspecs, can you try the following easy experiment?

Ltac change_compspecs cs ::=
 match goal with |- context [?cs'] =>
   match type of cs' with compspecs =>
     time "a" (try (constr_eq cs cs'; fail 1));
     time "b" change_compspecs' cs cs';
     time "c" repeat change_compspecs' cs cs'
   end
end.

This will tell us which part is taking the 10 or 20 seconds. I conjecture you will see 60 or 80 instances of the "a" part, each taking 1/4 of a second, and none of the "b" or "c". Then I will think about what to do next.

from vst.

MSoegtropIMC commented on September 25, 2024

With this modification I get 75 copies of:

Tactic call a ran for 0. secs (0.u,0.s) (success)
Tactic call b ran for 0.139 secs (0.104u,0.002s) (failure)
Tactic call a ran for 0. secs (0.u,0.s) (failure after 1 backtracking)

where the time for b varies between 120ms and 142ms. The sequence is abaabaaba..., so I have 75 calls of b and 150 of a.

from vst.

andrew-appel commented on September 25, 2024

Now try it with this:

Ltac prove_call_setup_aux  ts witness ::=
 let H := fresh "SetupOne" in
 intro H;
 match goal with | |- @semax ?CS _ _ (PROPx ?P (LOCALx ?L (SEPx ?R'))) _ _ =>
 let Frame := fresh "Frame" in evar (Frame: list mpred); 
 let cR := (fun R =>
 exploit (call_setup2_i _ _ _ _ _ _ _ _ R R' _ _ _ _ ts _ _ _ _ _ _ _ H witness Frame); clear H;
 simpl functors.MixVariantFunctor._functor;
 [ try_convertPreElim
 | check_prove_local2ptree
 | check_vl_eq_args
 | auto 50 with derives
 | check_gvars_spec
 | let lhs := fresh "lhs" in 
   match goal with |- ?A |-- ?B => pose (lhs := A); change (lhs |-- B) end;
   try change_compspecs CS; subst lhs;
   cancel_for_forward_call
 |
 ])
  in strip1_later R' cR
 end.

from vst.

andrew-appel commented on September 25, 2024

I am slightly surprised by the fact that "a" succeeded and "b" took some time to fail. The next step would be


Ltac change_compspecs' cs cs' ::=
  lazymatch goal with
  | |- context [@data_at cs' ?sh ?t ?v1] => time "A1" erewrite (@data_at_change_composite cs' cs _ sh t); [| time "A2" apply JMeq_refl | prove_cs_preserve_type]
  | |- context [@field_at cs' ?sh ?t ?gfs ?v1] => time "B1" erewrite (@field_at_change_composite cs' cs _ sh t gfs); [| time "B2" apply JMeq_refl | prove_cs_preserve_type]
  | |- context [@data_at_ cs' ?sh ?t] => time "C1" (erewrite (@data_at__change_composite cs' cs _ sh t); [| time "C2" prove_cs_preserve_type]
  | |- context [@field_at_ cs' ?sh ?t ?gfs] => time "D1" (erewrite (@field_at__change_composite cs' cs _ sh t gfs); [| time "D2" prove_cs_preserve_type]
  | |- _ => time "E"
    match goal with 
  | |- context [?A cs'] => 
     idtac "Warning: attempting change_compspecs on user-defined mpred:" A;
         change (A cs') with (A cs)
  | |- context [?A cs' ?B] => 
     idtac "Warning: attempting change_compspecs on user-defined mpred:" A;
         change (A cs' B) with (A cs B)
  | |- context [?A cs' ?B ?C] => 
     idtac "Warning: attempting change_compspecs on user-defined mpred:" A;
         change (A cs' B C) with (A cs B C)
  | |- context [?A cs' ?B ?C ?D] => 
     idtac "Warning: attempting change_compspecs on user-defined mpred:" A;
         change (A cs' B C D) with (A cs B C D)
  | |- context [?A cs' ?B ?C ?D ?E] => 
     idtac "Warning: attempting change_compspecs on user-defined mpred:" A;
         change (A cs' B C D E) with (A cs B C D E)
  | |- context [?A cs' ?B ?C ?D ?E ?F] => 
     idtac "Warning: attempting change_compspecs on user-defined mpred:" A;
         change (A cs' B C D E F) with (A cs B C D E F)
   end
 end.

from vst.

andrew-appel commented on September 25, 2024

I can confirm that the new version of prove_call_setup_aux (with the extra "pose (lhs:=A)" logic) passes all the CI tests locally. It is intended to greatly reduce the number of SEP clauses that need to be examined by change_compspecs. If you can confirm that overriding prove_call_setup_aux (as in my "now try it with this" message) actually makes things significantly faster for you, then I can commit this version in time for the next release 2.13 that will go in the November 2023 Coq Platform.
Independently from overriding prove_call_setup_aux, the experiment in my message "I am slightly surprised" will shed some light on what's taking so long.

from vst.

MSoegtropIMC commented on September 25, 2024

It is a bit tricky for me to test prove_call_setup_aux because at that level I replaced everything with Ltac2 code. I can leave some goals open and experiment on them using Ltac1, though. I also can't easily use the original high level Ltac1 code, because I have some wrapper code around it which automatically constructs the witness, which is also Ltac2 and the interface to the original Ltac1 code is gone (I could dig it out from git ...).

With the modified change_compspecs' (I fixed some typos, see below) I get 75 copies of:

Tactic call a ran for 0. secs (0.u,0.s) (success)
Tactic call A1 ran for 0. secs (0.u,0.s) (failure)
Tactic call b ran for 0. secs (0.u,0.s) (failure)
Tactic call a ran for 0. secs (0.u,0.s) (failure after 1 backtracking)

and the overall time went from 14.278s down to 4.5 in one case.

The code I actually used is:

  Ltac change_compspecs' cs cs' ::=
  lazymatch goal with
  | |- context [@data_at cs' ?sh ?t ?v1] => time "A1" erewrite (@data_at_change_composite cs' cs _ sh t); [ time "A2" apply JMeq_refl | prove_cs_preserve_type]
  | |- context [@field_at cs' ?sh ?t ?gfs ?v1] => time "B1" erewrite (@field_at_change_composite cs' cs _ sh t gfs); [| time "B2" apply JMeq_refl | prove_cs_preserve_type]
  | |- context [@data_at_ cs' ?sh ?t] => time "C1" erewrite (@data_at__change_composite cs' cs _ sh t); [ time "C2" prove_cs_preserve_type]
  | |- context [@field_at_ cs' ?sh ?t ?gfs] => time "D1" erewrite (@field_at__change_composite cs' cs _ sh t gfs); [| time "D2" prove_cs_preserve_type]
  | |- _ => time "E"
    match goal with 
  | |- context [?A cs'] => 
     idtac "Warning: attempting change_compspecs on user-defined mpred:" A;
         change (A cs') with (A cs)
  | |- context [?A cs' ?B] => 
     idtac "Warning: attempting change_compspecs on user-defined mpred:" A;
         change (A cs' B) with (A cs B)
  | |- context [?A cs' ?B ?C] => 
     idtac "Warning: attempting change_compspecs on user-defined mpred:" A;
         change (A cs' B C) with (A cs B C)
  | |- context [?A cs' ?B ?C ?D] => 
     idtac "Warning: attempting change_compspecs on user-defined mpred:" A;
         change (A cs' B C D) with (A cs B C D)
  | |- context [?A cs' ?B ?C ?D ?E] => 
     idtac "Warning: attempting change_compspecs on user-defined mpred:" A;
         change (A cs' B C D E) with (A cs B C D E)
  | |- context [?A cs' ?B ?C ?D ?E ?F] => 
     idtac "Warning: attempting change_compspecs on user-defined mpred:" A;
         change (A cs' B C D E F) with (A cs B C D E F)
   end
 end.

from vst.

andrew-appel commented on September 25, 2024

If you can't use my version of prove_call_setup_aux directly, then let me explain the point of the change that I made, and perhaps you can apply that idea to your Ltac2 version.
The purpose of change_compspecs is this: Suppose in foo.c I have imported struct/union definitions ("composite specifications") s1,s2,s3; and in bar.c I have imported struct/union definition s2,s6,s7. Then they share s2 in common. Now suppose the caller has a data_at that uses struct s2, and the callee's precondition requires a data_at that also uses struct s2. Each data_at is implicitly parameterized by the compspecs of foo.c or bar.c, and we need to reinterpret the function's precondition (which was originally written in the compspecs of bar.c) in the compspecs of the caller (foo.c). This is done by change_compspecs, which will fail if (for example) they have a different definition of s2 and if struct s2 is mentioned in the funspec's precondition.

At forward_call, the subgoal that is proved by try change_compspecs CS; cancel_for_forward_call looks like this:

current_assertion |-- function-precondition * Frame

where Frame is basically an evar (that is, a local definition wrapped around an e-var). The current implementation of prove_call_setup_aux applies change_compspecs to this entire entailment. But we expect that only the function-precondition will need adjustment to its compspecs. Furthermore, if the caller has 60 SEP conjuncts in its assertion, then current_assertion is much bigger than function-precondition.

So my solution (with set (lhs := A)) is to hide the current_assertion in a local definition, so that change_compspecs does not operate upon it. change_compspecs will operate only on the function-precondition, which may be much smaller.

Perhaps you can apply this idea in your Ltac2 version.

from vst.

andrew-appel commented on September 25, 2024

Regarding another point in your message beginning "It is a bit tricky": I don't fully understand why erewrite (@data_at_change_composite cs' cs _ sh t) fails, and why that failure does not hurt anything else. We should only get to that point if the entailment current_assertion |-- function_precondition * Frame contains a data_at with a different (implicit) compspecs parameter than the caller's. So it would be interesting to determine exactly what conjunct it is, and whether it's in the current_assertion or in the function_precondition.

from vst.

MSoegtropIMC commented on September 25, 2024

Thank you for the detailed explanation! My plan with Ltac2 is to analyse the situation and then execute exactly those tactics which are required to solve the goal. Floyd currently does a lot of analysis by trying tactics, which I would say is the main reason for avoidable slowness. Such an analysis is much easier to do in Ltac2 than in Ltac1, since in Ltac2 I can more easily keep some context information in Ltac2 data structures. Your explanations are very helpful for this. If I understand you right, I need to do the comspec change only in the situation that I have identical types declared in different compilation units - this should be easy enough to detect.

In order to help you improve the Ltac1 code - which will also help me since my plan is to do in Ltac2 only the majority of the function calls and use your tactics for the rest - it is probably best if I create a simple example of this - it shouldn't be hard. I expect that all it takes is a largish number of local variables. I should have something from previous discussions we had on Floyd performance with large number of variables.

from vst.

MSoegtropIMC commented on September 25, 2024

One more note on this: there is another call to change_compspecs in

Ltac after_forward_call :=
        check_POSTCONDITION;
        try
        match goal with
          | |- context [ remove_localdef_temp ] => simplify_remove_localdef_temp
          end; unfold_app; try (apply extract_exists_pre; intros _);
        match goal with
        | |- semax _ _ _ _ => idtac
        | |- unit -> semax _ _ _ _ => intros _
        end; match goal with
              | |- semax _ _ _ _ => try change_compspecs CS
              end; repeat (apply semax_extract_PROP; intro); cleanup_no_post_exists;
        abbreviate_semax; try fwd_skip

so I have the ~ 15s twice in each forward-call. I see if I can find a fast solution in Ltac2.

from vst.

andrew-appel commented on September 25, 2024

My latest commit c0025ba addresses this issue in after_forward_call. The idea is to limit the change_compspecs to only those conjuncts in the funspec-postcondition, and not traverse the user's current assertion (the frame).

from vst.

andrew-appel commented on September 25, 2024

Pull request #728 (now merged) includes, among other things, improvements to forward_call and after_forward_call to speed things up (in some cases) by applying change_compspecs only to the funspec and not to the current assertion and frame.

from vst.

MSoegtropIMC commented on September 25, 2024

@andrew-appel : the following analysis on why this takes a while might still be interesting.

I used the following definition (with ::= to override the original definition):

Ltac change_compspecs cs ::=
  time (
  idtac "change_compspecs A" cs;
  match goal with |- context [?cs'] =>
   match type of cs' with compspecs =>
     idtac "change_compspecs B" cs;
     try (constr_eq cs cs'; idtac "change_compspecs C1" cs cs'; fail 1);
     idtac "change_compspecs C2" cs cs';
     change_compspecs' cs cs';
     repeat change_compspecs' cs cs'
   end
  end).

Then I did run a single forward_call - your plain ltac1 with just the prove_local2ptree replaced with my ltac2 version (the version before your fix).

The output is:

change_compspecs A CompSpecs
75 copies of 
    change_compspecs B CompSpecs
    change_compspecs C1 CompSpecs CompSpecs
Tactic call ran for 4.546 secs (3.42u,0.032s) (failure)
change_compspecs A CompSpecs
75 copies of 
    change_compspecs B CompSpecs
    change_compspecs C1 CompSpecs CompSpecs
Tactic call ran for 4.744 secs (3.569u,0.034s) (failure)

So what really takes time is the context matching or the constr_eq. It never goes into handling the case that the compspecs are different. Just the two checks if this might be so took 9s (with 75 locals - a slightly different case than the one I analysed before where it took 14s - it is harder for me to call the plain ltac1 tactic in this more complicated case.

from vst.

MSoegtropIMC commented on September 25, 2024

P.S.: I also wrapped the constr_eq into a time like this:

  Ltac change_compspecs cs ::=
  time (
  idtac "change_compspecs A" cs;
  match goal with |- context [?cs'] =>
   match type of cs' with compspecs =>
     idtac "change_compspecs B" cs;
     time (try (constr_eq cs cs'; idtac "change_compspecs C1" cs cs'; fail 1));
     idtac "change_compspecs C2" cs cs';
     change_compspecs' cs cs';
     repeat change_compspecs' cs cs'
   end
  end).

the output of these time tacticals is always 0 - so the time is spent in the context matching.

from vst.

andrew-appel commented on September 25, 2024

Thank you. But I still suspect that with my most recent changes to forward_call and after_forward_call (in my just-merged p.r.), this will be greatly reduced in your examples, because I believe that most of the conjuncts in your examples are in the frame, not in the precondition/postcondition of the funspec, so now they won't be examined at all.

from vst.

MSoegtropIMC commented on September 25, 2024

I am not so sure. I expect the time is simply proportional to the overall goal size - including all the terms hidden in implicit arguments. In the end you match on every possible sub term of the goal and check if it has type compspecs. Assuming that restarting the context match is reasonably efficient, it shouldn't depend much on how many compspecs you find.

Of course your PR does reduce the goal size - I just have no intuition on how much.

I will try your PR.

from vst.

MSoegtropIMC commented on September 25, 2024

I am btw. struggling to find a minimum reproduction example. When I have 80 local variables which are structs, so that I have 80 data_at_ in the context right after start_function, the time for a forward_call is about 4..5s. Looking at the above analysis I guess I have just a lot of other stuff around which makes my goal larger.

Shall I send you the 5s case?

from vst.

MSoegtropIMC commented on September 25, 2024

Btw.: since the match goes over all subterms of the goal, this should increase more than linear with the goal size.

from vst.

andrew-appel commented on September 25, 2024

Here's an interesting experiment, testing how long it takes to do constr_eq A A where A is a large term (not just a definition-name standing for a large term).

Inductive tree := L : tree | N : tree -> tree -> tree.

Fixpoint f n :=
 match n with
 | O => L
 | S n' => N (f n') (f n')
 end.

Time Definition c := Eval compute in f 23.

Goal False.

time "A" constr_eq c c.
let x := constr:(c) in let y := eval hnf in x in time "B" constr_eq y y.
let x := constr:(c) in let y := eval compute in x in time "C" constr_eq y y.
let x := constr:(c) in let y := eval hnf in x in time "E" unify y y.

Results:

Finished transaction in 11.684 secs (11.304u,0.379s) (successful)
Tactic call A ran for 0. secs (0.u,0.s) (success)
Tactic call B ran for 0. secs (0.u,0.s) (success)
Tactic call C ran for 0. secs (0.u,0.s) (success)
Tactic call E ran for 0.903 secs (0.874u,0.028s) (success)

What we learn from this is:

constr_eq A B takes constant time if A and B are the same exact term, even if it's a large term.
unify A B takes linear time if A and B are the same exact term. That's surprising; I would have thought that Coq's unification algorithm would do a constr_eq check to avoid doing lots of work.

from vst.

MSoegtropIMC commented on September 25, 2024

Yes - my measurements also showed that constr_eq is very fast. So the problem is more likely this construct:

match goal with |- context [?cs'] =>
   match type of cs' with compspecs

I expect that the second match is executed for every subterm of the goal. This can be quite a large number. The constr_eq ist the only called for tersm with have the correct type, which is a much smaller number.

from vst.

andrew-appel commented on September 25, 2024

Another solution, which would avoid the match type of cs' with compspecs search, is each funspecs should have an additional, explicit compspecs component, indicating what compspecs can be expected to be found in the precondition conjuncts and postcondition conjuncts. But before I consider such a major change, I'd like to see whether you notice any speed improvements from my PR.

from vst.

Some observations on `forward_call` performance about vst HOT 21 CLOSED

Comments (21)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent