NEWSLETTER

No. 51, April 2001

**Contents**

From the AAR President

MathXpert As a Theorem Prover

Woody Bledsoe Student Travel Award

Herbrand Award: Amendments to the Selection Procedure

Conference Calls

New Books

Several months ago I began an active correspondence with Michael Beeson about some of the work we have reported in AAR newsletters and some of the research I am doing in the area of missing proofs. Beeson mentioned his software program MathXpert, and we started a discussion about whether it is or is not a theorem prover. Beeson's article in this newsletter directly addresses that issue. We both welcome comments.

I also call attention to the fact that several new books
have appeared recently
on automated theorem proving and (the name I prefer) automated reasoning.
I remind our readers that the * Journal of Automated Reasoning*
welcomes book reviews;
please submit them to the managing editor, Gail W. Pieper
(pieper@mcs.anl.gov).

Introduction to MathXpert

MathXpert (the software formerly known as Mathpert) is a program designed
to help people learn algebra, trigonometry, and calculus. It contains a
computer algebra system, a logic module called the * prover,* a graphics
module, and some complex code meant to provide a simple user interface. The
program's primary purpose was education, and it is being marketed with that
in mind (see the
Web site). This article will ignore education,
graphics, and user interfaces and will focus on the logical capabilities of
MathXpert. We will attempt to answer the AAR president's question, "Is MathXpert
a theorem prover, or not?"

A MathXpert session begins with the specification of a * topic* (for example,
solving cubic equations, verifying trigonometric identities, or evaluating
integrals--there are about 140 topics altogether). Then the user enters a
problem appropriate for that topic. After the problem is accepted, the next
task is the generation of a step-by-step * solution* of the problem. This can
take place under user control, with the user choosing the steps and the
program executing them. It can also take place in * auto mode,* in which
MathXpert can generate the entire solution. These two modes are roughly
analogous to what we mean when we say * proof checker* and * theorem prover.*
Only auto mode solutions will be discussed in this article, since our topic
is theorem-proving.

Here is a sample solution, the output of MathXpert when the input is the arithmetic-geometric mean inequality.

Is this a proof? I am certain it would be accepted as such in most mathematics classes.

One may object, however, that this is not a * formal* proof; I shall take up below the
question whether there is a formal proof here to be extracted. At the
moment, the only purpose of displaying this example is to fix the ideas as
to what MathXpert actually produces.

The company marketing MathXpert is about to make available MathXpert Online, which neatly separates the auto mode capabilities of MathXpert from all others and makes the auto mode solutions (only) available over the Internet for free. When this service is available, it will be found at the Web site.

** Logic and Correct Computation**

MathXpert makes the connection between logic and computation by maintaining
a list of assumptions. Each (visible) line of computation can be thought of
as depending on a list of assumptions. If we put this list of assumptions
to the left of
, and the visible step *A* to right of
, then we have a
sequent
, of which
is called the * antecedent* and *A* the * succedent.*
When computational steps (operations) are applied, they may have side
conditions (hypotheses). These have to be verified before the operation can
be legitimately applied. This mechanism prevents MathXpert from deriving
false conclusions. More details of how side conditions are handled in this
sequent-calculus framework can be found in [2].

Other computer algebra systems have nothing like this kind of logical underpinnings and hence can easily be led to deduce false conclusions. Witness this example from Mathematica 4.0.2:

The integral in question is divergent, since it has a singularity at the origin. Mathematica 4.0.2. (unlike some earlier versions) correctly notes that fact when the integrand isIn[3]= Integrate[1/Sin[x], {x, -1, 1}] Out[3]= 0

One may well say that correct computation is nice, but it does not make a theorem prover. Of course that is correct, but if we want a theorem prover that can make proofs including computational steps, such as the above example, then the computational steps must be logically correct. Correct computation is a prerequisite to a theorem prover that can take computational steps.

** Kinds of Theorems MathXpert Can Prove**

Most of the topics available in MathXpert begin with a mathematical
expression (such as an integral, limit, derivative, or algebraic
expression), and the solution takes steps that rewrite each line to an
equal expression. The theorem that is proved is that the original problem
equals the last line generated, provided that the assumptions on which the
last line depends hold. (These can be seen in MathXpert by choosing * View*
* Assumptions*.) There are three other types of problems to be considered:
solving equations, verifying identities, and "solving" inequalities. When
solving equations, MathXpert essentially is asked to find the values of *x*
such that *f(x) = 0*, and if it succeeds, it gives an explicit disjunction
that is equivalent to the existential statement. In the case of
trigonometric equations, the solutions are parametrized by new integer
parameters; for example, the solution of
is
.

In MathXpert, to "solve" an inequality for *x* means to reduce the inequality
to a disjunction of inequalities or pairs of inequalities which express the
solution set as a union of intervals. This includes as a special case the
possibility that the answer might be * true,* as in the arithmetic-geometric
mean example above. This means that the inequality has been proved. Unlike
with equations, there is no clear distinction between verifying and solving
an inequality.

** How the Prover Works**

The MathXpert prover can perhaps best be thought of as using rewrite-rule
technology combined with evaluation and ad hoc control of the rewrites. It
is not purely rewrite rules: the simplest example is the rule that permits
rewriting *x ^{3} x^{5}* as

** The Underlying Logic**

Part of the prover has to deal with verifying side conditions inside the
scope of a bound variable (such as in a limit term or definite integral).
In the course of justifying the use of algorithms based on nonstandard
analysis for verifying side conditions in a limit term, I set out a formal
theory that I believe adequately captures the mathematical realm of
MathXpert [1]. The variables can range over integers, real numbers, or
complex numbers. The logic must deal correctly with terms that can be
undefined such as *x*, which is accomplished by the use of the Logic of
Partial Terms (LPT). It must also include term-formation rules for limits,
derivatives, and integrals. It is also possible to use a function variable
in MathXpert, although the vast majority of theorems do not use them.
Formally speaking, it might be simplest to allow them, and to allow the
formation of lambda-terms, because then limits, derivatives, and integrals
can be defined by using lambda-terms. Of course, so can quantifiers then, but
in practice MathXpert proofs are quantifier free.

The axioms of this theory are numerous: some 1600 mathematical operations are implemented in MathXpert. Of course, it is possible to derive these systematically from a small number of axioms. But the proofs MathXpert actually generates can appeal to any of these 1600 "axioms" at any step. In principle it would be possible to write a program that translated MathXpert proofs automatically into proofs in some simple logical theory whose axioms would amount, in essence, to the axioms of ordered fields complete with respect to zeroes of functions defined in the theory, and the definitions of the trigonometric and exponential functions, and closed under derivatives and integrals of functions defined by terms of the theory. Since there are also integer variables, a few axioms about the integers would be required, but MathXpert does not use induction.

** Enabling the Prover to Handle Quantifiers**

I combined MathXpert's logical engine with a very simple theorem prover
that applies the rules of Gentzen's sequent calculus "backwards". This
added the ability to handle quantifiers explicitly, so that, for example,
epsilon-delta proofs could be reduced to the necessary inequalities. This
combination program, known as * Weierstrass,* was able [3] to prove the
epsilon-delta continuity of several specific functions such as *x ^{3}* and
and to prove the irrationality of

** Comparison with Otter**

MathXpert does not search for a proof--it simply uses its rewrite-style engine. In that sense it is much weaker than Otter, which can generate and examine hundreds of thousands of intermediate deductions. It cannot work with arbitrary theories but only with the fixed mathematical framework used in high-school algebra and freshman calculus. It cannot, for example, deduce the ten Knuth-Bendix axioms that form a complete confluent set for group theory---one cannot even ask the question, since it assumes multiplication is commutative (except for matrices).

On the other hand, it is not quite trivial for most theorem provers (including Otter) to prove the arithmetic-geometric mean inequality, say from the axioms of ordered fields with a square-root function. I do not mean to imply that proving that inequality is a remarkable achievement--of course, the arithmetic-geometric mean inequality falls within the decidable realm of real-closed fields, and so can be proved by the well-known CAD decision algorithm. I propose it as an interesting exercise for AAR members to get their favorite theorem-prover to prove it by its usual logical means. For MathXpert it is just another problem.

Because MathXpert does not search for proofs and cannot deal with an arbitrary first-order theory, it is not a traditional theorem prover, but it does produce proofs. MathXpert's strengths, as a theorem-prover, are first, its ability to take steps involving computation, and second, its ability to handle inequalities.

** Notes**

The papers listed below and other papers about MathXpert can be downloaded from
the
Web.
Other relevant links can be found on the author's

home page.

** References**

[1] Beeson, M., Using nonstandard analysis to verify the correctness of
computations, * International Journal of Foundations of Computer Science*
6, no. 3 (1995) 299-338.

[2] Beeson, M., MathXpert: Computer support for learning algebra,
trigonometry, and calculus, in: A. Voronkov (ed.), * Logic Programming and
Automated Reasoning, Lecture Notes in Artificial Intelligence*
624, Springer-Verlag, Berlin (1992).

[3] Beeson, M., Automatic generation of epsilon-delta proofs of continuity,
in: J. Calmet and J. Plaza (eds.), * Artificial Intelligence and Symbolic
Computation, Lecture Notes in Artificial Intelligence* 1476, pp. 67-83,
Springer-Verlag, Berlin (1998).

Call for Nominations

(On behalf of the CADE Inc. Board of Trustees)

The Woody Bledsoe Student Travel Award was created to honor the memory of Woody Bledsoe, for his contributions to mathematics, artificial intelligence, and automated theorem proving, and for his dedication to students. The award is intended to enable selected students to attend the International Conference on Automated Deduction (CADE) or the International Joint Conference on Automated Reasoning (IJCAR), whichever is scheduled for the year, by covering much of their expenses.

In 2001, IJCAR will take place June 18-23 in Siena, Italy (for further information see the Web site. The winners will be reimbursed (to a maximum of US $600) for their conference registration, transportation, and accomodation expenses. Preference will be given to students who do not have alternative funding. While preference will be given to students who will play an active role in the conference, including all attached workshops and events, also students who do not expect to give presentations, including students who have just begun their research in automated deduction or are considering the field, are encouraged to apply.

A nomination consists of a recommendation letter of up to 300 words from
the student's supervisor.
Nominations for IJCAR 2001 should be sent by e-mail to

Fabio Massacci, IJCAR 2001 Conference Chair (ijcar-cch@dii.unisi.it) and

Rajeev Gore, Alexander Leitsch, Tobias Nipkow, IJCAR 2001 Program Co-Chairs
(ijcar-pch@dii.unisi.it),

with copies to

Ulrich Furbach, CADE Inc. President (uli@informatik.uni-koblenz.de) and

Neil V. Murray, CADE Inc. Treasurer (nvm@cs.albany.edu).

Nominations must arrive no later than May 7, 2001, and the winners will be notified by May 14, 2001. The awards will be presented at IJCAR; in case a winner does not attend, the chairs and trustees may transfer the award to another nominee or give no award.

(On behalf of the CADE Inc. Board of Trustees)

The description of the Herbrand Award from the CADE-11 proceedings has been amended by the CADE trustees. The modifications reflect discussions in the board of trustees in May-June 2000 and again in January-February 2001, terminated with a vote approving the new text in February-March 2001. As always, comments and suggestions from the AAR members are welcome.

In previous years the Herbrand Award was decided based on a vote where CADE trustees, members of the program committee, and past recipients of the award received the following ballot:

*You are not asked to rank the N nominees, but simply to answer the
question below for each nominee. Note that the resulting N questions
are independent: the answer to one does not prejudice the answer to
another. When the votes are in, the trustees will make the award to
the nominee receiving the strongest support, provided that nominee
receives "yes" votes from at least 2/3 of the votes cast. (Eligible to
vote are trustees, program committee members, and the previous
recipients of the award). In the event that no nominee satisfies this
condition, no award will be made. In the event that K nominees tie
with maximum support, K awards will be made if they have received at
least 2/3 support.
*

*
If you think several candidates are each worthy of the award, even
though others have been nominated, you may answer "yes" in several
cases.
*

*
If you would prefer the award to go to certain candidates rather than
others, you may answer "yes" for those you prefer and "no" to the
others.
*

*
If you think that none are clearly worthy, you may answer "no" to all.
*

This was followed by the question

*
Do you consider that nominee(s) X is(are) worthy of the
Herbrand Award for Distinguished Contributions to Automated Reasoning?
(YES / NO)
*

for all nominees X.

As this practice had raised a number of criticisms, the trustees sought to modify it according to the following principles:

resulting in the new text below.

Distinguished Contributions to Automated Reasoning

CADE Inc. has established since 1992 the Herbrand Award for Distinguished Contributions to Automated Reasoning, to honor an individual or (a group of) individuals for exceptional contributions to the field of Automated Deduction.

Nominations for this award can be made at any time to the president of CADE Inc. A nomination should include a letter (up to 2000 words) from a principal nominator describing the nominee's contributions, along with two other letters (up to 2000 words) of endorsement. Current members of the board of trustees of CADE Inc. are not eligible.

The CADE president will be responsible for soliciting opinions and evaluations, and carrying out a vote. Nominations will be kept confidential. The nomination of a group of individuals who are collaborators, will be considered as a single nomination.

The program committee, the award winners from the last 10 years, and the current board of trustees of CADE, Inc. will participate in the selection, with the final decision resting with the board of trustees. The award will be given at the CADE or IJCAR conference, whichever is scheduled for the year.

The selection procedure will work as follows. The members of the current program committee, winners from the past 10 years, and trustees are invited to express their opinions on the nominees in two ways:

It is expected that only those receiving an endorsement by a 2/3 majority under vote (1) will be considered by the trustees for the award. A preferred majority candidate will be selected by the single transferrable vote principle using the preferential ranking (2).After possible further discussion among the trustees, the president of CADE proposes a candidate which should be endorsed in a final vote by 2/3 of the trustees. It is expected that in most cases this should be the majority candidate from the advisory vote.

** FLoC'02: Call for Workshop Proposals**

** International Workshop on Functional and (Constraint) Logic Programming**

For further information, see the Web site.

Both "theoretical" papers (max. 15 pages) and "experimental" papers (max. 10 pages) are welcome and are due July 15, 2001. Instructions for submission are on the conference Web page.

** Automated Theorem Proving: Theory and Practice**

- clear, concise presentation of the fundamentals
- extensive end-of-chapter exercises, with solutions
- HERBY, an excellent and unique semantic-tree theorem-proving program
- THEO, a strong resolution-refutation
theorem-proving program*approximately 200 theorems included on the CD-ROM
for experimentation

The book and software are an excellent text/reference for advanced students, practitioners, and professionals in computer science, applied math, logical computation, and artificial intelligence. Anyone with an interest in automated reasoning will find the book an essential guide and hands-on tool for learning about the theorem-proving process.

232 pages

Springer, 2000

ISBN 0387950753

US$55

244 pages

Springer, 2001

ISBN 3-211-83241-6

US$69

** Symbolic Computation and Automated Reasoning:
The CALCULEMUS-2000 Symposium**

288 pages

A. K. Peters, Ltd., 2001

ISBN: 1-56881-145-4

US $60

Web site