E 0.7 and EP 0.7

S. Schulz
Institut für Informatik, Technische Universität München, Germany, and Safelogic A.B., Gothenburg, Sweden
schulz@informatik.tu-muenchen.de

Architecture

E 0.62[Sch2001,Sch2002] is a purely equational theorem prover. The calculus used by E combines superposition (with selection of negative literals) and rewriting. No special rules for non-equational literals have been implemented, i.e. resolution is simulated via paramodulation and equality resolution. E also implements AC redundancy elimination and AC simplification for dynamically recognized associative and commutative equational theories, as well as pseudo-splitting for clauses.

E is based on the DISCOUNT-loop variant of the given-clause algorithm, i.e. a strict separation of active and passive facts. Proof search in E is primarily controlled by a literal selection strategy, a clause evaluation heuristic, and a simplification ordering. The prover supports a large number of preprogrammed literal selection strategies, many of which are only experimental. Clause evaluation heuristics can be constructed on the fly by combining various parameterized primitive evaluation functions, or can be selected from a set of predefined heuristics. Supported term orderings are several parameterized instances of Knuth-Bendix-Ordering (KBO) and Lexicographic Path Ordering (LPO).

An automatic mode can select literal selection strategy, term ordering (different versions of KBO and LPO), and search heuristic based on simple problem characteristics.

EP 0.7 is just a combination of E 0.7 in verbose mode and a proof analysis tool extracting the used inference steps.

Implementation

E is implemented in ANSI C, using the GNU C compiler. The most outstanding feature is the global sharing of rewrite steps. Contrary to earlier version of E, which destructively changed all shared instances of a term, the latest version only adds a rewrite link from the rewritten to the new term. In effect, E is caching rewrite operations as long as sufficient memory is available. A second important feature is the use of perfect discrimination trees with age and size constraints for rewriting and unit-subsumption.

The program has been successfully installed under SunOS 4.3.x, Solaris 2.x, HP-UX B 10.20, and various versions of Linux. Sources of the latest released version and a current snapshot are available freely from:

    http://wwwjessen.informatik.tu-muenchen.de/~schulz/WORK/eprover.html

EP 0.7 is a simple Bourne shell script calling E and the postprocessor in a pipeline.

Strategies

E's automatic mode is optimized for performance on TPTP 2.4.0. The optimization is based on a fairly large number of test runs and consists of the selection of one of about 50 different strategies for each problem. All test runs have been performed on SUN Ultra 60/300 machines with a time limit of 300 second (or roughly equivalent confogurations). E distinguishes four primary problem classes: Problems where all non-negative clauses are unit clauses, where all non-negative clauses are Horn clauses (and at least one is not a unit), and non-Horn problems without and with equality. For each of these classes a separate set of candidate heuristics is created and run over all problems in this class. To generate the automatic mode for any of the 4 primary classes, the class is partitioned into subclasses defined by (some) of the following 9 features:

Are all negative clauses unit clauses?
Are all literals equality literals, are some literas equality literals, or is the problem non-equational?
Are there only a few, some, or many positive non-ground unit clauses among the axioms?
Are all goals (negative clauses) ground?
Are there a few, some, or many clauses in the problem?
Are there a few, some, or many literals?
Are there a few, some, or many (sub)terms?
Are there a few, some or many positive ground unit clauses among the axioms?
Is the maximum arity of any function symbol 0, 1, 2, or greater?

Wherever there is a selection of few, some, and many of a certain entity, the limits are selected automatically with the aim of splitting the set of clauses into three sets of approximately equal size based on this one feature.

For each non-empty class, we assign the most general candidate heuristic that solves the same number of problems on this class as the best heuristic on this class does. Typically, most selected heuristics are assigned to more than one class. As an example, the current intermediate version uses a total of 8 heuristics to cover all 532 problems with unit axioms.

Expected Competition Performance

In the last year, E performed well in the MIX category of CASC and came in third in the UEQ division. We believe that E will again be among the strongest provers in the MIX category, in particular due to its good performance for Horn problems. In UEQ, E will probably be beaten only by Waldmeister.

EP 0.7 will be hampered by the fact that it has to analyse the inference step listing, an operation that typically is about as expensive as the proof search itself. Nevertheless, it should be competitive among the MIX* systems.

References

Sch2001: Schulz S. (2001), System Abstract: E 0.61, Proceedings of the 1st IJCAR, (Sienna, Italy), Lecture Notes in Artificial Intelligence Springer-Verlag
Sch2002: Schulz S. (2002), E: A Brainiac Theorem Prover, Journal of AI Communications, to appear.