E 0.8 and EP 0.8
S. Schulz
Institut für Informatik, Technische Universität
München, Germany, and RISC-Linz, Johannes Kepler Universität
Linz, Austria
schulz@informatik.tu-muenchen.de
Architecture
E 0.8[Sch2001,Sch2002] is a purely
equational theorem prover. The calculus used by E combines
superposition (with selection of negative literals) and rewriting. No
special rules for non-equational literals have been implemented,
i.e. resolution is simulated via paramodulation and equality
resolution. E also implements AC redundancy elimination and AC
simplification for dynamically recognized associative and commutative
equational theories, as well as pseudo-splitting for clauses. It now
also unfolds equational definitions in a preprocessing stage.
E is based on the DISCOUNT-loop variant of the given-clause
algorithm, i.e. a strict separation of active and passive facts.
Proof search in E is primarily controlled by a literal selection
strategy, a clause evaluation heuristic, and a simplification
ordering. The prover supports a large number of preprogrammed literal
selection strategies, many of which are only experimental. Clause
evaluation heuristics can be constructed on the fly by combining
various parameterized primitive evaluation functions, or can be
selected from a set of predefined heuristics. Supported term orderings
are several parameterized instances of Knuth-Bendix-Ordering (KBO) and
Lexicographic Path Ordering (LPO).
An automatic mode can select literal selection strategy, term ordering
(different versions of KBO and LPO), and search heuristic based on simple
problem characteristics.
EP 0.8 is just a combination of E 0.8 in verbose mode and a proof
analysis tool extracting the used inference steps.
Implementation
E is implemented in ANSI C, using the GNU C compiler. The most
outstanding feature is the global sharing of rewrite steps. Contrary
to earlier version of E, which destructively changed all shared
instances of a term, the latest version only adds a rewrite
link from the rewritten to the new term. In effect, E is caching
rewrite operations as long as sufficient memory is available. A second
important feature is the use of perfect discrimination trees
with age and size constraints for rewriting and unit-subsumption.
The program has been successfully installed under SunOS 4.3.x,
Solaris 2.x, HP-UX B 10.20, MacOS-X, and various
versions of Linux. Sources of the latest released version and a
current snapshot are available freely from:
http://www4.informatik.tu-muenchen.de/~schulz/WORK/eprover.html
EP 0.8 is a simple Bourne shell script calling E and the postprocessor in
a pipeline.
Strategies
E's automatic mode is optimized for performance on TPTP 2.5.1. The
optimization is based on a fairly large number of test runs and
consists of the selection of one of about 50 different strategies for
each problem. All test runs have been performed on SUN Ultra 60/300
machines with a time limit of 120 seconds (or roughly equivalent
confogurations). All individual strategies are general purpose, the
worst one solves about 45% of TPTP 2.5.1, the best one 55%.
E distinguishes problem classes based on a number of features, all of
which have between two and 4 possible values. These are
- Is the most general non-negative clause unit, Horn, or Non-Horn?
- Is the most general negative clauce unit or non-unit?
- Are all negative clauses unit clauses?
- Are all literals equality literals, are some literas equality
literals, or is the problem non-equational?
- Are there only a few, some, or many positive non-ground
unit clauses among the axioms?
- Are all goals (negative clauses) ground?
- Are there a few, some, or many clauses in the problem?
- Are there a few, some, or many literals?
- Are there a few, some, or many (sub)terms?
- Are there a few, some or many positive ground
unit clauses among the axioms?
- Is the maximum arity of any function symbol 0, 1, 2, or greater?
Wherever there is a selection of few, some, and many of a certain
entity, the limits are selected automatically with the aim of
splitting the set of clauses into three sets of approximately equal
size based on this one feature.
For each non-empty class, we assign the most general candidate
heuristic that solves the same number of problems on this class as the
best heuristic on this class does. Empty classes are assigned the
globally best heristic. Typically, most selected heuristics are
assigned to more than one class.
Expected Competition Performance
In the last year, E performed well in the MIX category of CASC and
came in third in the UEQ division. We believe that E will again be
among the strongest provers in the MIX category, in particular due to
its good performance for Horn problems. In UEQ, E will probably be
beaten only by Waldmeister, and, possibly, E-SETHEO (which
incorporates it).
EP 0.8 will be hampered by the fact that it has to analyse the
inference step listing, an operation that typically is about as
expensive as the proof search itself. Nevertheless, it should be
competitive among the MIX* systems.
References
- Sch2001
- Schulz S. (2001),
System Abstract: E 0.61,
Proceedings of the 1st IJCAR,
(Sienna, Italy),
Lecture Notes in Artificial Intelligence,
Springer-Verlag
- Sch2002
- Schulz S. (2002),
E: A Brainiac Theorem Prover,
Journal of AI Communications 15(2/3), 111-126, IOS Press