E 0.82 and EP 0.82
S. Schulz
Institut für Informatik, Technische Universität
München, Germany, and ITC/irst, Trento, Italy
schulz@informatik.tu-muenchen.de
Architecture
E 0.82 [Sch2002,Sch2004] is a purely
equational theorem prover. The core proof procedure operates on
formulas in clause normal form, using a calculus that combines
superposition (with selection of negative literals) and rewriting. No
special rules for non-equational literals have been implemented,
i.e., resolution is simulated via paramodulation and equality
resolution. The basic calculus is extended with rules for AC
redundancy elimination, some contextual simplification, and
pseudo-splitting.
E is based on the DISCOUNT-loop variant of the given-clause
algorithm, i.e., it maintains a strict separation of active and passive facts.
Proof search in E is primarily controlled by a literal selection
strategy, a clause evaluation heuristic, and a simplification
ordering. The prover supports a large number of preprogrammed literal
selection strategies, many of which are only experimental. Clause
evaluation heuristics can be constructed on the fly by combining
various parameterized primitive evaluation functions, or can be
selected from a set of predefined heuristics. Supported term orderings
are several parameterized instances of Knuth-Bendix-Ordering (KBO) and
Lexicographic Path Ordering (LPO).
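The control flow of a given-clause loop with separate active and passive sets can be illustrated with a minimal sketch. It is not E's implementation: to stay short it assumes propositional clauses (frozensets of integer literals), uses binary resolution in place of superposition and rewriting, and takes clause size as a stand-in evaluation heuristic with clause age as tie-breaker. All function names are hypothetical.

```python
import heapq
from itertools import count


def resolvents(c1, c2):
    """All non-tautological binary resolvents of two propositional clauses."""
    out = []
    for lit in c1:
        if -lit in c2:
            res = (c1 - {lit}) | (c2 - {-lit})
            # Tautologies are redundant and are dropped immediately.
            if not any(-l in res for l in res):
                out.append(frozenset(res))
    return out


def given_clause_loop(axioms, weight=len, max_steps=10_000):
    """True: empty clause derived; False: saturated; None: step limit hit."""
    age = count()
    # Passive clauses wait in a priority queue ordered by (weight, age).
    passive = [(weight(c), next(age), c) for c in map(frozenset, axioms)]
    heapq.heapify(passive)
    active = []
    for _ in range(max_steps):
        if not passive:
            return False
        _, _, given = heapq.heappop(passive)
        if not given:
            return True
        # Only active clauses take part in redundancy checks and inferences;
        # passive clauses are left untouched until they are selected.
        if any(a <= given for a in active):
            continue  # forward subsumption: given clause is redundant
        active = [a for a in active if not given <= a]  # backward subsumption
        for a in active:
            for new in resolvents(given, a):
                heapq.heappush(passive, (weight(new), next(age), new))
        active.append(given)
    return None
```

Swapping the `weight` argument changes the order in which passive clauses are selected, which is the role the clause evaluation heuristic plays in the description above.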
E uses a preprocessing step to convert formulas in full first-order
format to clause normal form. Preprocessing also unfolds equational
definitions and performs some simplifications on the clause level.
The automatic mode selects literal selection strategy, term
ordering, and search heuristic based on simple problem characteristics
of the preprocessed clausal problem.
EP 0.82 is just a combination of E 0.82 in verbose mode and
a proof analysis tool extracting the used inference steps.
Implementation
E is implemented in ANSI C, using the GNU C compiler. The most
outstanding feature is the global sharing of rewrite steps. Current
versions of E add rewrite
links from rewritten to new terms. In effect, E is caching
rewrite operations as long as sufficient memory is available. Other
important features are the use of perfect discrimination trees
with age and size constraints for rewriting and unit-subsumption,
feature vector indexing [Sch2004b]
for forward and backward subsumption and contextual literal cutting,
and a new polynomial implementation of LPO [Loe2004].
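The effect of rewrite links can be sketched as follows. This is an illustration only, not E's term bank: it assumes ground terms represented as nested tuples, a terminating set of ground rewrite rules stored in a dictionary, and an unbounded `links` dictionary (the memory limit mentioned above is not modelled). The name `normalize` is hypothetical.

```python
def normalize(term, rules, links):
    """Rewrite a ground term to normal form, caching each result in `links`.

    term:  tuple (symbol, arg1, ..., argn), each argument again a term
    rules: dict mapping a left-hand side term to its right-hand side
    links: dict mapping an already processed term to its normal form
    """
    if term in links:
        # A rewrite link exists: reuse the earlier result.
        return links[term]
    head, *args = term
    # Innermost strategy: normalize the arguments first.
    new = (head, *(normalize(a, rules, links) for a in args))
    if new in rules:
        new = normalize(rules[new], rules, links)
    links[term] = new
    return new
```

Because identical subterms hash to the same key, a subterm that occurs in many clauses is rewritten once and every later occurrence follows the stored link.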
The program has been successfully installed under SunOS 4.3.x,
Solaris 2.x, HP-UX B 10.20, MacOS-X, and various
versions of Linux. Sources of the latest released version are
available freely from:
http://www.eprover.org
EP 0.82 is a simple Bourne shell script calling E and the postprocessor in
a pipeline.
Strategies
E's automatic mode is optimized for performance on TPTP 2.6.0. The
optimization is based on about 90 test runs over the library (and
previous experience) and consists of the selection of one of about 50
different strategies for each problem. All test runs have been
performed on SUN Ultra 60/300 machines with a time limit of 300
seconds (or roughly equivalent configurations). All individual
strategies are general purpose; the worst one solves about 49% of TPTP
2.6.0, the best one about 60%.
E distinguishes problem classes based on a number of features, all of
which have between two and four possible values. The most important ones
are:
- Is the most general non-negative clause unit, Horn, or Non-Horn?
- Is the most general negative clause unit or non-unit?
- Are all negative clauses unit clauses?
- Are all literals equality literals, are some literals equality
literals, or is the problem non-equational?
- Are there a few, some, or many clauses in the problem?
- Is the maximum arity of any function symbol 0, 1, 2, or greater?
- Is the sum of function symbol arities in the signature small,
medium, or large?
Wherever there is a three-way split on a numerical feature value,
the limits are selected automatically with the aim of
splitting the set of problems into approximately
equal-sized parts based on this one feature.
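One simple way to obtain such limits is to read them off the sorted feature values at the one-third and two-thirds positions. The sketch below assumes this tercile approach and a non-empty list of values; the source does not state how E computes its limits, and the function names are hypothetical.

```python
def tercile_limits(values):
    """Return (low, high) so that the tests v <= low, low < v <= high and
    v > high split the values into three parts of roughly equal size."""
    ordered = sorted(values)
    n = len(ordered)
    low = ordered[(n - 1) // 3]
    high = ordered[(2 * (n - 1)) // 3]
    return low, high


def bucket(value, limits):
    """Map a numerical feature value to one of three symbolic values."""
    low, high = limits
    if value <= low:
        return "small"
    if value <= high:
        return "medium"
    return "large"
```

For the nine values 1 to 9, `tercile_limits` returns `(3, 6)`, giving three buckets of three problems each. If many problems share the same feature value, the parts can be noticeably uneven, which is why the split is only approximately equal.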
For classes above a threshold size, we assign the absolute best
heuristic to the class. For smaller, non-empty classes, we assign the
globally best heuristic that solves the same number of problems on
this class as the best heuristic on this class does. Empty classes are
assigned the globally best heuristic. Typically, most selected
heuristics are assigned to more than one class.
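The assignment rule can be sketched as below. The data layout and the function name are hypothetical, and one point is an assumption rather than something stated above: for large classes, ties in the number of solved problems are broken by the lower total run time on the class, whereas small classes fall back to global performance.

```python
def assign_heuristics(classes, results, threshold):
    """Assign one heuristic to each problem class.

    classes:   dict class name -> set of problem names
    results:   dict heuristic -> {solved problem name: run time in seconds}
    threshold: classes with more problems than this count as large
    """
    # Global ranking: heuristics that solve more problems overall come first.
    ranking = sorted(results, key=lambda h: len(results[h]), reverse=True)
    assignment = {}
    for name, problems in classes.items():
        if not problems:
            # Empty class: use the globally best heuristic.
            assignment[name] = ranking[0]
            continue
        solved = {h: results[h].keys() & problems for h in ranking}
        best = max(len(s) for s in solved.values())
        # Heuristics matching the best local count, in global ranking order.
        tied = [h for h in ranking if len(solved[h]) == best]
        if len(problems) > threshold:
            # Large class: locally best heuristic (ASSUMED time tie-break).
            assignment[name] = min(
                tied, key=lambda h: sum(results[h][p] for p in solved[h])
            )
        else:
            # Small class: globally best among the locally optimal ones.
            assignment[name] = tied[0]
    return assignment
```

Preferring the globally best heuristic on small classes guards against overfitting to a handful of problems, and it is one reason why a single heuristic tends to end up assigned to several classes.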
Expected Competition Performance
Last year, E performed well in the MIX category of CASC and
came in third in the UEQ division. We believe that E will again be
among the strongest provers in the MIX category, in particular due to
its good performance for Horn problems. In UEQ, E will probably be
beaten only by Waldmeister, and, possibly, E-SETHEO (which
incorporates it). We cannot predict performance on FOF problems yet,
but hope that E will be competitive.
EP 0.82 will be hampered by the fact that it has to analyse the
inference step listing, an operation that typically is about as
expensive as the proof search itself. Nevertheless, it should be
competitive among the MIX* and FOF* systems.
References
- Sch2002
- Schulz S. (2002),
E: A Brainiac Theorem Prover,
Journal of AI Communications 15(2/3), 111-126, IOS Press
- Sch2004
- Schulz S. (2004),
System Abstract: E 0.81,
Proceedings of the 3rd IJCAR,
(Cork, Ireland),
Lecture Notes in Artificial Intelligence,
Springer-Verlag
- Sch2004b
- Schulz S. (2004),
Simple and Efficient Clause Subsumption with Feature
Vector Indexing,
Proceedings of the IJCAR-2004 Workshop on Empirically
Successful First-Order Theorem Proving, (Cork, Ireland),
ENTCS, Elsevier Science
- Loe2004
- Löchner B. (2004),
What to know when implementing LPO,
Proceedings of the IJCAR-2004 Workshop on Empirically
Successful First-Order Theorem Proving, (Cork, Ireland),
ENTCS, Elsevier Science