Theorem Proving Using Semantic Resolution

For those of you into propositional logic, Anthony offers working code (in C) for experimenting with "mechanical" theorem proving.

April 01, 1988
URL:http://www.drdobbs.com/cpp/theorem-proving-using-semantic-resolutio/184407930

Figure 1

Figure 2

Figure 1

Figure 2

APR88: THEOREM PROVING USING SEMANTIC RESOLUTION

THEOREM PROVING USING SEMANTIC RESOLUTION

Resolving those nasty semantic clashes with C

Anthony J. Dos Reis

Anthony J. Dos Reis is an assistant professor at the College at New Paltz, Dept. of Mathematics and Computer Science, New Paltz, NY 12561.

One expects the resolution principle--a simple rule of inference useful in theorem proving--to date back to antiquity, along with modus ponens and Euclid's elements. In fact, though, it was discovered as recently as 1965 by JA. Robinson and constituted a major breakthrough in mechanical theorem proving. It is surprising that such a powerful and (it seems now) obvious result did not surface much earlier.

Since 1965, the resolution principle has undergone many refinements. One such refinement--semantic resolution--is discussed in this article. To avoid becoming mired in the complexities of formal logic, I will restrict my attention to propositional logic. Although the power of propositional logic is limited, it is nevertheless an ideal vehicle for presenting the essential aspects of semantic resolution. (See Listing One page 64.)

Basic Definitions

I will use the symbols & v, ->, and to represent the logical operations conjunction, disjunction, implication, and negation, respectively.

An expression (called the conclusion) is said to logically follow from a set of expressions (called the premises) if the conclusion is true whenever the premises are all true. A conclusion with its premises is called an argument. An argument is valid if its conclusion logically follows from its premises. The conclusion in a valid argument is called a theorem.

A literal is a variable or the negation of a variable. A clause is either a single literal or a disjunction of literals or is empty (denoted by the symbol ). Examples of clauses are , p, q, p v q and p v q v r. A variable and its negation (for example, p, p) is called a complementary pair. The assignment of a truth value to every variable under consideration is called an interpretation. A set of clauses is unsatisfiable if for every interpretation at least one clause in the set is false.

Expressions with identical truth table values are said to be equivalent. It can be shown that any logical expression can be transformed to an equivalent expression consisting of either a single clause or a conjunction of clauses. For example, the expression (p v q) r can be converted to the equivalent expression ( p v r) & ( q v r). Thus, without loss of generality, we can restrict our attention to logical expressions of this standard form. In fact, we can restrict our attention to individual clauses because a conjunction can be viewed as several individual premises or conclusions. For example, the premise ( p v r) & ( q v r) can be viewed as two premises--( p v r) and ( q v r)--each a single clause.

To prove that a conclusion logically follows from its premises in propositional logic, you simply need to construct a truth table and determine if the conclusion meets the definition of logically follows. Although this approach is eminently satisfactory for propositional logic, it unfortunately falls apart in more powerful logic systems (such as firstorder logic) in which truth tables may not be of finite size. For such systems rules of inference are used to establish whether a conclusion logically follows from its premises. A rule of inference is a rule that describes how to generate a new expression that logically follows from existing expressions. The step-by-step derivation of the conclusion from the premises using rules of inference is called a proof.

Typically, you need many rules of inference to generate the expressions needed in a proof. For example, the rule of inference modus ponens given by:

      
E1 -> E2 
E1      
E2

cannot be used to show that p logically follows from p -> q and q because the premises are not of the appropriate form. Modus ponens by itself is not complete--that is, there are valid arguments whose conclusions cannot be generated using modus ponens alone.

A serious problem with having many rules of inference is that it is necessary to select the appropriate rule at each step in a proof. The appropriate rule is, in general, not obvious. You could, of course, try every rule at every step, but such a brute-force approach is very inefficient in time and space.

The Resolution Principle

The resolution principle is a rule of inference given by the four forms shown in Table 1, below.

Table 1: The four forms of resolution


(i)               (ii)              (iii)            (iv)
E1 v E2           E1 v E2           E1                E1
E1 v E3      E1          E1 v E3       E1
E2 v E3          E2             E3

Resolution requires two clauses (the clauses to be resolved) with a complementary pair--that is, one clause should contain some expression E1 and the other E1. The resolvent--that is, the clause generated--is formed from the expressions that remain after E1 and E1 are thrown away. If more than one expression remains (as in the first form in Table 1), the resolvent is the disjunction of these expressions. If no expressions remain (as in the a fourth form in Table 1), the resolvent is the empty clause, denoted by . Because disjunction is commutative, the order of the expressions is not important.

If two clauses that both contain some expression E are resolved, the resolvent need not have two occurrences of E because E v E always equals E. For example, you can take the resolvent of p v q v r and p v q to be q v r rather than q v q v r.

If clauses that have more than one complementary pair are resolved, only one complementary pair is eliminated. For example, if you resolve p v q and p v q, you get either q v q or p v p depending on which complementary pair you eliminate. Here the resolvent is always a tautology (that is, an expression that is always true) and therefore logically follows from any set of premises. In refutation proofs (see later), such resolutions are useless and should never be performed.

In the following example, 6 through 9 are obtained by resolution. Each is labeled with the line numbers of the expressions from which it is derived.

1    p v   q v r
2    p v q v s
3     p
4     r
5     s
6    p v r v s     (1,2)
7    p v r         (5,6)
8    r             (3,7)
9                (4,8)

The rule of inference modus ponens given in clause form:

E1 v E2 E1 E2

is clearly a special case of the resolution principle. Other rules of inference--modus to tollens, hypothetical syllogism, disjunctive syllogism, constructive dilemma, and destructive dilemma--are also special cases of resolution. That resolution encompasses all these rules is an indication of its power and generality.

Proof by Refutation

Suppose you wish to show that the argument:

P1 P2 . . premises . Pn C conclusion

is valid--that is, C logically follows from P1, P2..., Pn. The straightforward approach is to derive C from the premises using rules of inference. Another approach--called proof by refutation--is to negate the conclusion and then derive the empty clause from the premises and the negated conclusion. I'll first look at an example and then discuss why this approach works.

To show that the following argument is valid:

      p v r
      r v t      premises
       p

T conclusion

first negate the conclusion and add it to the premises. Then using rules of inference, derive the empty clause:

1    p v r
2    r v t
3     p
4    t    negation of the conclusion
5     r        (1,3)
6    r        (2,4)
7            (5,6)

To understand why this approach works, first note that the generation of the empty clause (line 7) requires the resolution of an expression and its negation, r and r. Now assume that the initial clauses (1, 2, 3 and 4) are all true. Then both r and r must also be true because they are derived from the initial clauses using a rule of inference. But because r and r always have opposite truth values, they cannot both be true. The assumption--that the initial clauses are all true--must be false. Thus, whenever clauses 1, 2, and 3 are all true, clause 4 (the negation of the conclusion) is false and the conclusion is true.

Although proof by refutation seems like a roundabout way of proving a theorem, it is actually more suitable for machine implementation than the straightforward approach for two reasons. First, the goal of the refutation approach is an empty clause, which is generally easier to derive because it may be obtained from any one of perhaps several complementary pairs. For example, another possibility for the previous proof is to derive t and t, as follows:

1    p v r
2    r v t
3     p
4    t        negation of the conclusion
5     r           (1,3)
6     t           (2,5)
7               (4,6)

Second and even more important, resolution is complete when used in a refutation proof--that is, if the original argument is valid, then the empty clause can always be derived using only the resolution principle. Our bagful of inference rules is not needed.

Completeness of Resolution

The basic idea in the proof of the completeness of the resolution principle can be illustrated by considering a simple example. Suppose the set of clauses from which the empty clause is to be derived is { p, p v q, q}. Construct a binary tree as follows:

1. Start with an unlabeled root node.

Now repeat steps 2 and 3 for each variable V that appears among the set of clauses (in this example, steps 2 and 3 will be done twice--once for V=p and once for V=q):

2. Extend the tree from any unlabeled leaf node by adding two branches that terminate on two new nodes, the left and right of which represent the assignment of true and false, respectively, to V.

3. If the truth assignment associated with all the nodes in the path from the root to a node added in step 2 makes a clause false, then label that node with that clause. (Labeled nodes are called failure nodes.)

The tree constructed corresponding to this example is shown in Figure 1, page 51. Observe that every leaf node must be a failure node because otherwise an interpretation would exist that satisfies the set of clauses. Moreover, at least one node (called an inference node) must have two failure nodes immediately below it. Because the truth assignments associated with these failure nodes differ for exactly one variable, theIr corresponding clauses must contain exactly one complementary pair and are therefore resolvable. Furthermore, the resolvent is false (in the sense of step 3 above) for the inference node. Thus, the inference node can be labeled with the resolvent (making it a failure node) and the two nodes below it pruned. Thus, you get the tree shown in Figure 2, page 55.

Figure 1: Binary tree

Figure 2: Binary tree after resolution

The same argument can be applied repeatedly, resulting in further pruning (and resolution) until only the root node remains, corresponding to the derivation of the empty clause.

Semantic Resolution

In a resolution proof, the clauses to be resolved must be selected at each step. In general, there may be many resolvable clauses that produce resolvents that are not required for the proof. For example, consider:

1    p v q
2    p v  q
3     p v q
4     p v  q
5    p                 (1,2)
6     p               (3,4)
7                    (5,6)

The resolvents (1,3), (2,4), (1,6), (2,6), (3,5), and (4,5) are not needed for the proof and are, therefore, wasteful to generate. A mechanical theorem prover, however, unless otherwise constrained, would typically generate such resolvents. For resolution to be practical, it must be restricted from generating too many unnecessary resolvents. The restrictions used, however, must still allow the necessary resolvents to be generated, or else the technique would not be complete.

One approach to restricting resolution that is complete is called semantic resolution. Semantic resolution applies two restrictions to resolution: one using an interpretation; the other, an ordering. Recall that an interpretation is the assignment of a truth value to each variable. In the following set of clauses, for example:

1    p v q
2    p v  q
3     p v q
4     p v  q

p can be assigned true and q false. With this particular interpretation, clauses 1, 2, and 4 all evaluate to true, and clause 3 evaluates to false. For a given interpretation, a clause that evaluates to true is called a nucleus (denoted by a +), and a clause that evaluates to false is called an electron (denoted by a -). An ordering is simply a listing, in descending priority, of the variables in a set of clauses. A single interpretation and a single ordering should be used for the entire proof. The particular interpretation and ordering used is not critical, however--any will work.

The two restrictions of semantic resolution are:

1. Never resolve a nucleus with a nucleus.

2. Resolve an electron with a nucleus only if the variable to be eliminated has the highest priority among the variables that appear in the electron.

An example should make these two rules clear. For an interpretation, assign true to p and q and take <p, q> as the ordering. Now consider the following clauses:

1    p v q          (+)
2    p v  q        (+)
3     p v q        (+)
4     p v  q      (-)

The possible resolvents are (1,2), (1,3), (2,4), and (3,4). By restriction 1, (1,2) and (1,3) should not be generated because both are nuclei. By restriction 2, (3,4) should not be generated because the variable p appears in the electron and has higher priority than q, the variable eliminated. Thus, the only allowable resolvent is (2,4):

5     q               (2,4),(-)

With clause 5 added to the list, the only possibilities now are (1,5) and (3,5):

6    p                 (1,5),(+)
7     p               (3,5),(-)

Now the possibilities are (4,6), (1,7), (2,7), and (6,7):

8     q               (4,6),(-)
9    q                 (1,7),(+)
10    q               (2,7),(-)
11                   (6,7)

Two additional points should be made about semantic resolution. First, if an interpretation makes the clauses all true or all false, then the original argument is not valid and, therefore, cannot be proved. Second, an electron can never be resolved with an electron because resolution requires a pair of clauses with complementary expressions, one of which must be true regardless of the interpretation used. Thus, one of the clauses is always a nucleus.

Semantic Clashes

Consider the following proof using semantic resolution with true assigned to p q and r as the interpretation and with <p, q, r> as the ordering. All the resolvents that can be generated under semantic resolution are listed.

1    p v  q v r               (+)
2    q                         (+)
3     p v  q                 (-)
4     r                       (-)
5     q v r              (1,3)(+)
6    p v  q              (1,4)(+)
7     q                  (4,5)
8     q                  (3,6)(-)
9                          (2,7)

The true literals, p and r, in the nucleus on line 1 can be eliminated, respectively, by the electron on line 3 and the electron on line 4. Clauses 1, 3, and 4 constitute a semantic clash--that is, a single nucleus together with a set of electrons that eliminate all its true literals under semantic resolution. A semantic clash contains exactly one electron for each true literal in its nucleus.

The nucleus and electrons in a semantic clash can be resolved in the following manner. First, resolve the nucleus with any electron in the semantic clash. Then resolve the resulting resolvent with another electron. Continue resolving electrons with successive resolvents until all the electrons are used. For example, for the semantic clash (1,3,4) above, you get:

1    p v  q v r            (+)
2    q                      (+)
3     p v  q              (-)
4     r                    (-)
5     q v r           (1,3)(+) intermediate resolvent
6     q               (4,5)(-) final resolvent

The final resolvent of a semantic clash is always an electron or the empty claus--never a nucleus.

The importance of resolving semantic clashes is that the intermediate resolvents are never needed to produce the empty clause and therefore can be omitted. Fewer unnecessary resolvents are generated, resulting in a more efficient proof. The previous proof with nine lines, redone with semantic clashes, becomes a proof with only six lines:

1    p v  q v r    (+)
2    q  (+)
3     p v  q      (-)
5     q  (1,3,4)   (-)  semantic clash
6        (2,5)         semantic clash

A semantic clash does not necessarily exist for every nucleus. For example, a semantic clash does not exist for clause 2 above until clause 5 is generated. However, because resolution restricted to semantic clashes is complete (for a proof, see Chang and Lee), there should always be at least one semantic clash until the empty clause is generated.

Bibliography

Chang, C.; and Lee, R.C. Symbolic Logic and Mechanical Theorem Proving. New York, N.Y.: Academic Press, 1973.

Winston, P.H. Artificial Intelligence. 2d ed. Englewood Cliffs, NJ.: Addison-Wesley, 1984.

[LISTING ONE]






     /*      A Theorem Prover for Propositional Logic
      *
      *      This program uses the resolution principle restricted to
      *  semantic clashes to prove theorems in propositional logic.
      *  The premises and the negation of the conclusion must be entered
      *  using 1, 0, and -1 to represent, respectively, the presence of a
      *  variable in true form, the absence of a variable, and the
      *  presence of a variable in negated form.  For example, a run
      *  in which the following four clauses are used
      *
      *       p v q v  r
      *            q
      *      p v q
      *                r
      *
      *  would look as follows:
      *
      *  enter number of clauses and variables
      *  4 3
      *  enter clauses
      *  1  -1  1
      *  0   1  0
      * -1  -1  0
      *  0   0 -1
      *
      *  The program would then respond with
      *
      *  initial clauses
      *    1 -1  1      (1)
      *    0  1  0      (2)
      *   -1 -1  0      (3)
      *    0  0 -1      (4)
      *  resolvents
      *    0 -1  0      (5 from 1, 3, 4)
      *    0  0  0      (6 from 2, 5)
      *  proof complete--empty clause generated
      *
      *      For each nucleus, the program searches for a set of electrons
      *  that make up a semantic clash.  When it finds a semantic clash, it
      *  generates the resolvent and then uses a recursive technique called
      *  backtracking to find other sets of electrons that also make up
      *  a semantic clash.  This implementation is compact but quite
      *  slow.
      *
      *  Author:  A. J. Dos Reis
      *           Dept. of Mathematics and Computer Science
      *           College at New Paltz
      *           New Paltz, N.Y. 12561
      *--------------------------------------------------------------------
      *     CONSTANTS
      */
     #define TRUE 1
     #define FALSE 0
     #define MAXCLAUSES 100
     #define MAXVARS 10
     #define ELECTRON -1
     #define NUCLEUS 1
     /*--------------------------------------------------------------------
      *     GLOBAL VARIABLES
      */
     int clause[MAXCLAUSES][MAXVARS];     /* Holds clauses               */
     int clausetype[MAXCLAUSES];          /* Holds type:  nuc or elec    */
     int clashelec[MAXVARS];              /* Holds row numbers of electrons
                                             that form a semantic clash  */
     int nvars;                           /* Number of variables         */
     int avail;                           /* Index of next avail row     */
     int newclauses;                      /* TRUE if new resolvents
                                             generated on last pass      */
     int noemptyclause;                   /* TRUE if empty clause not
                                             generated                   */
     /*-------------------------------------------------------------------
      *     main prompts for and inputs the clauses.  It then calls
      *     findsemclash for each nucleus.  It continues until the
      *     empty clause is generated or until no new resolvents are
      *     generated.
      */
     main()
     {
        int nclauses;                     /* Number of clauses           */
        int nuc;                          /* Index of nucleus            */
        int i,j;                          /* Utility indices             */
        printf("enter number of clauses and variables\n");
        scanf("%d%d",&nclauses,&nvars);
        printf("enter clauses\n");
        for (i=0; i<nclauses; i++)        /* Read in clauses             */
           {clausetype[i]=ELECTRON;
            for (j=0; j<nvars; j++)
               {scanf("%d",&clause[i][j]);
                if (clause[i][j]==1)
                   clausetype[i]=NUCLEUS; /* Set clausetype accordingly  */
               }
           }
        avail=nclauses;                   /* Keep track of next avail row*/
        printf("initial clauses\n");      /* Print out initial clauses   */
        for (i=0; i<nclauses; i++)
           {for (j=0; j<nvars; j++)
              printf("%3d",clause[i][j]);
            printf("     (%1d)\n",i+1);
           }
        printf("resolvents\n");
        noemptyclause=TRUE;               /* Initialize flag             */
        do
           {newclauses=FALSE;             /* Initialize flag             */
            nuc=0;                        /* Start search from row 0     */
            do
               {if (clausetype[nuc]==NUCLEUS) /* Search for nucleus      */
                  {for (i=0; i<nvars; i++) /* Initialize clashelec       */
                      clashelec[i]=-99;
                   findsemclash(0,nuc);   /* Look for all sem clashes    */
                  }
                nuc++;                    /* Repeat for next clause      */
               } while ((nuc<nclauses) && noemptyclause);
           } while (newclauses && noemptyclause);
        if (noemptyclause)                /* Failed to gen empty clause? */
           printf("argument not valid\n");
        else
           printf("proof complete--empty clause generated\n");
     }
     /*-------------------------------------------------------------------
      *     findsemclash searches for an electron which under semantic
      *     resolution will eliminate the true literal in column truecol
      *     of the nucleus in row nuc.  When it finds a satisfactory
      *     electron, it recursively calls itself to find a electron that
      *     eliminates the next true literal.  When a set of electrons
      *     that forms a semantic clash is found, findsemclash generates
      *     the resolvent instead of making a recursive call.  Then
      *     it backtracks to find other sets that also form semantic clashes.
      */
     findsemclash(truecol,nuc)
     {int col;                            /* Column index                */
      int elec;                           /* Row number of electron      */
      int goodelec;                       /* True if elec ok under
                                             semantic resolution         */
      int i,j;                            /* Utility indices             */
      while (truecol<nvars)               /* Checked all columns of nuc? */
        if (clause[nuc][truecol]==1)      /* Is this a true literal      */
           {elec=0;                       /* Search for appropriate elec */
            do
               {if (clausetype[elec]==ELECTRON)
                   {col=0;                /* check col-by-col if elec ok */
                    goodelec=TRUE;
                    while ((col<nvars) && goodelec)
                       {if (              /* Satisfies sem resolution?   */
                            ( (col<truecol) && (clause[elec][col]!=0))
                                            ||
                            ( (clause[nuc][col] * clause[elec][col]==-1)
                                            !=
                              (truecol==col))
                           )
                           goodelec=FALSE; /* Electron fails test        */
                        col++;             /* Test next column           */
                       }
                    if (goodelec)         /* Electron passed test?       */
                       {clashelec[truecol]=elec; /* Remember row number  */
                        findsemclash(truecol+1,nuc); /* Recursive call   */
                       }
                   }
                 elec++;                  /* Try next electron           */
               } while ((elec<avail) && noemptyclause);
            return;                       /* Backtrack                   */
           }
        else
           truecol++;                     /* Continue search for true lit
                                              in the nucleus             */
      newclauses=TRUE;                    /* Get here only when a sem
                                              clash is found             */
      for (i=0; i<nvars; i++)             /* Copy nuc to new avail row   */
        clause[avail][i]=clause[nuc][i];
      for (i=0; i<nvars; i++)
        {elec=clashelec[i];               /* Get row number of electron  */
         if (elec!=-99)                   /* -99 means electron not needed
                                              for the ith col            */
            for (j=0; j<nvars; j++)       /* Generate the resolvent      */
               {clause[avail][j]+=clause[elec][j];
                if (clause[avail][j]!=0)
                   noemptyclause=TRUE;
                if (clause[avail][j]==-2)
                    clause[avail][j]=-1;
               }
        }
     noemptyclause=FALSE;                 /* Initialize flag            */
     for (j=0; j<nvars; j++)              /* Check for empty clause     */
        {printf("%3d",clause[avail][j]);
         if (clause[avail][j])
            noemptyclause=TRUE;           /* Reset if no empty clause   */
        }
     printf("     (%1d from %1d",avail+1,nuc+1); /* Print resolvent     */
     for (i=0;i<nvars; i++)
        if (clashelec[i]!=-99)            /* Print electron row numbers */
               printf(", %1d",clashelec[i]+1);
     printf(")\n");
     clausetype[avail]=ELECTRON;          /* Set type of resolvent      */
     avail++;                             /* Increment avail row index  */
     }