Documentation

Mathlib.Tactic.RewriteSearch

The rw_search tactic #

rw_search attempts to solve an equality goal by repeatedly rewriting using lemmas from the library.

If no solution is found, the best sequence of rewrites found before maxHeartbeats elapses is returned.

The search is a best-first search, minimising the Levenshtein edit distance between the pretty-printed expressions on either side of the equality. (The strings are tokenized at spaces, separating delimiters (, ), [, ], and , into their own tokens.)

The implementation avoids completely computing edit distances where possible, only computing lower bounds sufficient to decide which path to take in the search.

Future improvements #

We could call simp as an atomic step of rewriting.

The edit distance heuristic could be replaced by something else.

No effort has been made to choose the best tokenization scheme, and this should be investigated. Moreover, the Levenshtein distance function is customizable with different weights for each token, and it would be interesting to try optimizing these (or dynamically updating them, adding weight to tokens that persistently appear on one side of the equation but not the other.)

The rw_search tactic will rewrite by local hypotheses, but will not use local hypotheses to discharge side conditions. This limitation would need to be resolved in the rw? tactic first.

Separate a string into a list of strings by pulling off initial ( or ] characters, and pulling off terminal ), ], or , characters.

Equations
  • One or more equations did not get rendered due to their size.

Pull off leading delimiters.

Pull off trailing delimiters.

Tokenize a string at whitespace, and then pull off delimiters.

Equations

Data structure containing the history of a rewrite search.

  • mk' :: (
    • The lemmas used so far.

    • The metavariable context after rewriting. We carry this around so the search can safely backtrack.

    • The current goal.

    • type : Lean.Expr

      The type of the current goal.

    • ppGoal : String

      The pretty printed current goal.

    • lhs : List String

      The tokenization of the left-hand-side of the current goal.

    • rhs : List String

      The tokenization of the right-hand-side of the current goal.

    • rfl? : Option Bool

      Whether the current goal can be closed by rfl (or none if this hasn't been test yet).

    • dist? : Option

      The edit distance between the tokenizations of the two sides (or none if this hasn't been computed yet).

  • )

What is the cost for changing a token? Levenshtein.defaultCost just uses constant cost 1 for any token.

It may be interesting to try others. the only one I've experimented with so far is Levenshtein.stringLogLengthCost, which performs quite poorly!

Equations

Check whether a goal can be solved by rfl, and fill in the SearchNode.rfl? field.

Equations
  • One or more equations did not get rendered due to their size.

Fill in the SearchNode.dist? field with the edit distance between the two sides.

Equations
  • One or more equations did not get rendered due to their size.

Represent a search node as string, solely for debugging.

Equations
  • One or more equations did not get rendered due to their size.

Construct a SearchNode.

Equations
  • One or more equations did not get rendered due to their size.

Add an additional step to the SearchNode history.

Equations

Report the index of the most recently applied lemma, in the ordering returned by rw?.

Equations
Equations
  • One or more equations did not get rendered due to their size.

A somewhat arbitrary penalty function. Note that n.lastIdx penalizes using later lemmas from a particular call to rw? at a node, but once we have moved on to the next node these penalties are "forgiven".

(You might in interpret this as encouraging the algorithm to "trust" the ordering provided by rw?.)

I tried out a various (positive) linear combinations of .history.size, .lastIdx, and .ppGoal.length (and also the .log2s of these).

  • .lastIdx.log2 is quite good, and the best coefficient is around 1.
  • .lastIdx / 10 is almost as good.
  • .history.size makes things worse (similarly with .log2).
  • .ppGoal.length makes little difference (similarly with .log2). Here testing consisting of running the current rw_search test suite, rejecting values for which any failed, and trying to minimize the run time reported by
lake build &&  \
time (lake env lean test/RewriteSearch/Basic.lean; \
  lake env lean test/RewriteSearch/Polynomial.lean)

With a larger test suite it might be worth running this minimization again, and considering other penalty functions.

(If you do this, please choose a penalty function which is in the interior of the region where the test suite works. I think it would be a bad idea to optimize the run time at the expense of fragility.)

Equations
@[reducible, inline]

The priority function for search is Levenshtein distance plus a penalty.

Equations
@[reducible, inline]

We can obtain lower bounds, and improve them, for the Levenshtein distance.

Equations

Given a RewriteResult from the rw? tactic, create a new SearchNode with the new goal.

Equations

Given a pair of DiscrTree trees indexing all rewrite lemmas in the imported files and the current file, try rewriting the current goal in the SearchNode by one of them, returning a MLList MetaM SearchNode, i.e. a lazy list of next possible goals.

Equations
  • One or more equations did not get rendered due to their size.
def Mathlib.Tactic.RewriteSearch.SearchNode.search (n : SearchNode) (stopAtRfl stopAtDistZero : Bool := true) (forbidden : Lean.NameSet := ) (maxQueued : Option := none) :

Perform best first search on the graph of rewrites from the specified SearchNode.

Equations
  • One or more equations did not get rendered due to their size.

rw_search attempts to solve an equality goal by repeatedly rewriting using lemmas from the library.

If no solution is found, the best sequence of rewrites found before maxHeartbeats elapses is returned.

The search is a best-first search, minimising the Levenshtein edit distance between the pretty-printed expressions on either side of the equality. (The strings are tokenized at spaces, separating delimiters (, ), [, ], and , into their own tokens.)

You can use rw_search [-my_lemma, -my_theorem] to prevent rw_search from using the names theorems.

Equations
  • One or more equations did not get rendered due to their size.