E 0.7 and EP 0.7
S. Schulz
Institut für Informatik, Technische Universität
München, Germany, and Safelogic A.B., Gothenburg, Sweden
schulz@informatik.tu-muenchen.de
Architecture
E 0.62[Sch2001,Sch2002] is a purely
equational theorem prover. The calculus used by E combines
superposition (with selection of negative literals) and rewriting. No
special rules for non-equational literals have been implemented,
i.e. resolution is simulated via paramodulation and equality
resolution. E also implements AC redundancy elimination and AC
simplification for dynamically recognized associative and commutative
equational theories, as well as pseudo-splitting for clauses.
E is based on the DISCOUNT-loop variant of the given-clause algorithm,
i.e. a strict separation of active and passive facts. Proof search in E is
primarily controlled by a literal selection strategy, a clause evaluation
heuristic, and a simplification ordering. The prover supports a large number of
preprogrammed literal selection strategies, many of which are only
experimental. Clause evaluation heuristics can be constructed on the fly by
combining various parameterized primitive evaluation functions, or can be
selected from a set of predefined heuristics. Supported term orderings are
several parameterized instances of Knuth-Bendix-Ordering (KBO) and Lexicographic
Path Ordering (LPO).
An automatic mode can select literal selection strategy, term ordering
(different versions of KBO and LPO), and search heuristic based on simple
problem characteristics.
EP 0.7 is just a combination of E 0.7 in verbose mode and a proof
analysis tool extracting the used inference steps.
Implementation
E is implemented in ANSI C, using the GNU C compiler. The most
outstanding feature is the global sharing of rewrite steps. Contrary
to earlier version of E, which destructively changed all shared
instances of a term, the latest version only adds a rewrite
link from the rewritten to the new term. In effect, E is caching
rewrite operations as long as sufficient memory is available. A second
important feature is the use of perfect discrimination trees
with age and size constraints for rewriting and unit-subsumption.
The program has been successfully installed under SunOS 4.3.x,
Solaris 2.x, HP-UX B 10.20, and various versions of
Linux. Sources of the latest released version and a current snapshot are
available freely from:
http://wwwjessen.informatik.tu-muenchen.de/~schulz/WORK/eprover.html
EP 0.7 is a simple Bourne shell script calling E and the postprocessor in
a pipeline.
Strategies
E's automatic mode is optimized for performance on TPTP 2.4.0. The
optimization is based on a fairly large number of test runs and
consists of the selection of one of about 50 different strategies for
each problem. All test runs have been performed on SUN Ultra 60/300
machines with a time limit of 300 second (or roughly equivalent
confogurations).
E distinguishes four primary problem classes: Problems where all
non-negative clauses are unit clauses, where all non-negative clauses
are Horn clauses (and at least one is not a unit), and non-Horn
problems without and with equality. For each of these classes a
separate set of candidate heuristics is created and run over all
problems in this class. To generate the automatic mode for any of the
4 primary classes, the class is partitioned into subclasses defined by
(some) of the following 9 features:
- Are all negative clauses unit clauses?
- Are all literals equality literals, are some literas equality
literals, or is the problem non-equational?
- Are there only a few, some, or many positive non-ground
unit clauses among the axioms?
- Are all goals (negative clauses) ground?
- Are there a few, some, or many clauses in the problem?
- Are there a few, some, or many literals?
- Are there a few, some, or many (sub)terms?
- Are there a few, some or many positive ground
unit clauses among the axioms?
- Is the maximum arity of any function symbol 0, 1, 2, or greater?
Wherever there is a selection of few, some, and many of a certain
entity, the limits are selected automatically with the aim of
splitting the set of clauses into three sets of approximately equal
size based on this one feature.
For each non-empty class, we assign the most general candidate
heuristic that solves the same number of problems on this class as the
best heuristic on this class does. Typically, most selected heuristics
are assigned to more than one class. As an example, the current
intermediate version uses a total of 8 heuristics to cover all 532
problems with unit axioms.
Expected Competition Performance
In the last year, E performed well in the MIX category of CASC and
came in third in the UEQ division. We believe that E will again be
among the strongest provers in the MIX category, in particular due to
its good performance for Horn problems. In UEQ, E will probably be
beaten only by Waldmeister.
EP 0.7 will be hampered by the fact that it has to analyse the
inference step listing, an operation that typically is about as
expensive as the proof search itself. Nevertheless, it should be
competitive among the MIX* systems.
References
- Sch2001
- Schulz S. (2001),
System Abstract: E 0.61,
Proceedings of the 1st IJCAR,
(Sienna, Italy),
Lecture Notes in Artificial Intelligence
Springer-Verlag
- Sch2002
- Schulz S. (2002),
E: A Brainiac Theorem Prover,
Journal of AI Communications, to appear.