Boolean satisfiability (SAT), a nondeterministic polynomial-time (NP) complete problem, deals with determining whether a given set of Boolean propositions can evaluate to a true value, by choosing an appropriate assignment to their constituent variables. Usually, SAT solvers, which are used to solve such a problem, take an input formula in conjunctive normal form (CNF), where clauses having disjunctions of literals are logically banded together. The Davis-Putnam-Logemann-Loveland (DPLL) algorithm is the common underlying method in today’s SAT solvers.
In many applications, when a given formula is found to be unsatisfiable by the SAT solver, it is of interest to find whether a smaller subset of the initial clauses is unsatisfiable. To identify a core set of clauses, SAT solvers can write a trace on the disk, from which a proof of how the given set of initial clauses leads to a logical contradiction can be derived.
Asín Achá et al. describe the problem and the various algorithms in an interesting manner. For the benefit of the readers, this paper includes most of the relevant background material.
The paper begins with a good introduction to the earlier work done in this area, and it briefly compares the relative merits of in-memory methods that can avoid the expenses of trace-based methods for generating proofs and cores. Then, a short but very useful overview of the DPLL algorithms is given, followed by a review of an earlier method by Zhang and Malik [1].
The content in Section 3, “Zhang and Malik’s Trace-based Method,” could have been better organized. Since this section also contains extensions proposed by the authors, it should have been titled differently.
Section 4 describes four in-memory algorithms. These algorithms are further discussed in Sections 5 and 6, along with experimental results.
The first in-memory algorithm adds marker literals to the initial clauses. This method is apparently popular, but it cannot be used for generating resolution proofs. Based on an implementation done by the authors, the additional literals can result in higher memory consumption in some cases.
The second algorithm attaches a list of identifiers for the initial ancestors for each clause. Experimental results show that, while this algorithm can perform better than the first one, it can still be expensive. Perhaps, in the authors’ implementation, computing the union of parent lists could be a performance bottleneck.
The third algorithm, which was implemented in PicoSAT, keeps immediate parent lists with each clause. The difficulty here is in deleting learned clauses.
The fourth in-memory algorithm is a new one proposed by the authors. It addresses the issue encountered in the third algorithm by maintaining a counter of children having active descendants with each clause. The authors claim that this approach results in slower growth of memory usage.
All of the algorithms are described within the text. The authors did not choose a standard algorithmic notation. Unfortunately, the paper lacks a more formal description. However, overall, the paper is well written and is a pleasure to read.