E 0.82 and EP 0.82

S. Schulz
Institut für Informatik, Technische Universität München, Germany, and ITC/irst, Trento, Italy
schulz@informatik.tu-muenchen.de

Architecture

E 0.82[Sch2002,Sch2004] is a purely equational theorem prover. The core proof procedure operates on formulas in clause normal form, using a calculus that combines superposition (with selection of negative literals) and rewriting. No special rules for non-equational literals have been implemented, i.e., resolution is simulated via paramodulation and equality resolution. The basic calculus is extended with rules for AC redundancy elemination, some contextual simplification, and pseudo-splitting.

E is based on the DISCOUNT-loop variant of the given-clause algorithm, i.e. a strict separation of active and passive facts. Proof search in E is primarily controlled by a literal selection strategy, a clause evaluation heuristic, and a simplification ordering. The prover supports a large number of preprogrammed literal selection strategies, many of which are only experimental. Clause evaluation heuristics can be constructed on the fly by combining various parameterized primitive evaluation functions, or can be selected from a set of predefined heuristics. Supported term orderings are several parameterized instances of Knuth-Bendix-Ordering (KBO) and Lexicographic Path Ordering (LPO).

E uses a preprocessing step to convert formulas in full first order format to clause normal form. Preprocessing also unfolds equational definitions and performs some simplifications on the clause level.

The automatic mode can selects literal selection strategy, term ordering, and search heuristic based on simple problem characteristics of the preprocessed clausal problem.

EP 0.82 is just a combination of E 0.82 in verbose mode and a proof analysis tool extracting the used inference steps.

Implementation

E is implemented in ANSI C, using the GNU C compiler. The most outstanding feature is the global sharing of rewrite steps. Current versions of E add rewrite links from rewritten to new terms. In effect, E is caching rewrite operations as long as sufficient memory is available. Other important features are the use of perfect discrimination trees with age and size constraints for rewriting and unit-subsumption, feature vector indexing[Sch2004b] for forward- and backward subsumption and contextual literal cutting, and a new polynomial implementation of LPO[Loe2004].

The program has been successfully installed under SunOS 4.3.x, Solaris 2.x, HP-UX B 10.20, MacOS-X, and various versions of Linux. Sources of the latest released version are available freely from:

    http://www.eprover.org

EP 0.82 is a simple Bourne shell script calling E and the postprocessor in a pipeline.

Strategies

E's automatic mode is optimized for performance on TPTP 2.6.0. The optimization is based on about 90 test runs over the library (and previous experience) and consists of the selection of one of about 50 different strategies for each problem. All test runs have been performed on SUN Ultra 60/300 machines with a time limit of 300 seconds (or roughly equivalent configurations). All individual strategies are general purpose, the worst one solves about 49% of TPTP 2.6.0, the best one about 60%. E distinguishes problem classes based on a number of features, all of which have between two and 4 possible values. The most important ones are:

Is the most general non-negative clause unit, Horn, or Non-Horn?
Is the most general negative clauce unit or non-unit?
Are all negative clauses unit clauses?
Are all literals equality literals, are some literas equality literals, or is the problem non-equational?
Are there a few, some, or many clauses in the problem?
Is the maximum arity of any function symbol 0, 1, 2, or greater?
Is the sum of function symbol arities in the signature small, medium, or large?

Wherever there is a three-way split on a numerical feature value, the limits are selected automatically with the aim of splitting the set of problems into approximately equal sized parts based on this one feature.

For classes above a threshold size, we assign the absolute best heuristic to the class. For smaller, non-empty classes, we assign the globally best heuristic that solves the same number of problems on this class as the best heuristic on this class does. Empty classes are assigned the globally best heuristic. Typically, most selected heuristics are assigned to more than one class.

Expected Competition Performance

In the last year, E performed well in the MIX category of CASC and came in third in the UEQ division. We believe that E will again be among the strongest provers in the MIX category, in particular due to its good performance for Horn problems. In UEQ, E will probably be beaten only by Waldmeister, and, possibly, E-SETHEO (which incorporates it). We cannot predict performance on FOF problems yet, but hope that E will be competitive.

EP 0.82 will be hampered by the fact that it has to analyse the inference step listing, an operation that typically is about as expensive as the proof search itself. Nevertheless, it should be competitive among the MIX* and FOF* systems.

References

Sch2002: Schulz S. (2002), E: A Brainiac Theorem Prover, Journal of AI Communications 15(2/3), 111-126, IOS Press
Sch2004: Schulz S. (2004), System Abstract: E 0.81, Proceedings of the 3rd IJCAR, (Cork, Ireland), Lecture Notes in Artificial Intelligence, Springer-Verlag
Sch2004b: Schulz S. (2004), Simple and Efficient Clause Subsumption with Feature Vector Indexing, Proceedings of the IJCAR-2004 Workshop on Empirically Successful First-Order Theorem Proving, (Cork, Ireland), ENTCS, Elsevier Science
Loe2004: Löchner b. (2004), What to know when implementing LPO, Proceedings of the IJCAR-2004 Workshop on Empirically Successful First-Order Theorem Proving, (Cork, Ireland), ENTCS, Elsevier Science