Generalizing satisfiability problems
In propositional logic, we used hard constraints to specify our system. Today, we will consider several generalizations of propositional logic and the Boolean satisfiability problem that allow us to formulate more complex problems. When considering the problems that can be encoded in a given framework, we will see we have the following spectrum of representational power, from least to greatest: SAT, Weighted MAX-SAT, Markov logic networks, factor graphs.
Minimum Vertex Cover (Weighted MAX-SAT)
Given an undirected graph $G = (V, E)$, we want to find a subset of vertices $S \subseteq V$ such that for each edge $(u, v) \in E$, either $u \in S$ or $v \in S$ (every edge is incident to one of the vertices in $S$). The minimum vertex cover problem is finding a smallest vertex cover. A trivial vertex cover would be the set of all vertices. A real-life example is a graph whose edges represent roads: finding a vertex cover could represent installing a small number of cameras at the intersections so that they cover all of the roads.
Consider formulating this problem the following way. We introduce one Boolean variable $x_v$ for each vertex $v \in V$, such that $x_v = 1$ iff $v \in S$. Then for each edge $(u, v) \in E$, we introduce the clause $x_u \lor x_v$. We want to minimize $\sum_{v \in V} x_v$. The hard requirement is finding a valid vertex cover, while the soft requirement is minimizing the size of the vertex cover.
An extension of SAT is Weighted MAX-SAT, where we add a weight $w_c$ to each clause $c$. For any assignment $x$, define the score $\text{score}(x)$ to be the sum of the weights of the clauses that $x$ satisfies, and the cost $\text{cost}(x)$ to be the sum of the weights of the clauses that $x$ does not satisfy. Then the optimization problem is finding the assignment $x$ with maximum score or minimum cost.
To formulate the minimum vertex cover problem as a Weighted MAX-SAT problem, we find the assignment that minimizes the cost of the following optimization problem: for each edge $(u, v) \in E$, the clause $x_u \lor x_v$ with weight $\infty$ (a hard constraint), and for each vertex $v \in V$, the clause $\lnot x_v$ with weight 1 (a soft constraint, so each vertex placed in the cover incurs cost 1).
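As a concrete illustration, here is a minimal brute-force sketch of this encoding in Python on a small hypothetical graph (a 4-cycle). The clause representation and the use of an infinite weight for hard constraints are our own conventions for this sketch, not a fixed standard.

```python
from itertools import product

INF = float("inf")

# A 4-cycle: its minimum vertex cover has size 2 (e.g., vertices 0 and 2).
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n = 4

# A clause is (literals, weight); a literal is (variable, value it wants).
clauses = [([(u, True), (v, True)], INF) for (u, v) in edges]  # hard: cover every edge
clauses += [([(v, False)], 1.0) for v in range(n)]             # soft: prefer x_v = 0

def cost(x):
    """Sum of the weights of the clauses that assignment x violates."""
    return sum(w for lits, w in clauses
               if not any(x[var] == val for var, val in lits))

# Brute force over all 2^n assignments; practical only for tiny n.
best = min(product([False, True], repeat=n), key=cost)
print("cover:", [v for v in range(n) if best[v]], "cost:", cost(best))
```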
To solve Weighted MAX-SAT, we can adapt solvers for SAT. Local search algorithms and greedy heuristics can use the cost or score function when deciding between possible moves. During systematic search, when exploring the search tree, solvers can use branch-and-bound techniques when deciding which subtrees to prune. This involves keeping track of lower bounds or upper bounds, based on the best solution found so far and estimates of the bounds in the current subtree, for either the cost or score function. To get the bounds for the subtrees, one technique is relaxing the domain of an integer optimization problem from $\{0, 1\}$ to $[0, 1]$ and solving the easier (linear) relaxed problem.
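As one example of using the cost function to guide moves, here is a minimal greedy local-search sketch: a generic hill-climbing scheme with random restarts, not any particular published solver. It can be run with the `cost` function and `n` from the vertex cover sketch above.

```python
import random

def local_search(n, cost, restarts=10, max_flips=100):
    """Greedy descent on cost(x) with random restarts."""
    best_x, best_cost = None, float("inf")
    for _ in range(restarts):
        x = [random.random() < 0.5 for _ in range(n)]
        for _ in range(max_flips):
            # Among all single-variable flips, take the one that helps most.
            candidates = [x[:i] + [not x[i]] + x[i + 1:] for i in range(n)]
            y = min(candidates, key=cost)
            if cost(y) >= cost(x):
                break            # no flip improves the cost: a local minimum
            x = y
        if cost(x) < best_cost:
            best_x, best_cost = x, cost(x)
    return best_x, best_cost

# Usage with the vertex cover encoding above:
# x, c = local_search(4, cost)
```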
Markov logic networks
Propositional logic fails when some statements in the knowledge base are inconsistent. Instead of having logical statements which are either true or false in a world, we want to generalize to statements which hold with some weight (probability). Markov logic networks are such a generalization, combining logic and probability.
The idea is to encode our problem using soft constraints among the variables in our model. By specifying weights on the soft constraints, we will then have a compact way to represent the most likely worlds consistent with our knowledge.
For any configuration (state) $x$, defined by an assignment of values to the variables in our model, we can compute the weight of that configuration. Formally, we define a cost function mapping each configuration $x$ to a cost $\text{cost}(x) \geq 0$, and we let the weight of a configuration be equal to $e^{-\text{cost}(x)}$.
Define $\text{cost}(x) = \sum_c w_c \, \mathbb{1}[x \text{ violates clause } c]$, where $\mathbb{1}[x \text{ violates clause } c]$ is an indicator function which is 1 if clause $c$ is violated (and 0 otherwise). By defining the normalization constant $Z = \sum_x e^{-\text{cost}(x)}$, we ensure we have valid probabilities $p(x) = e^{-\text{cost}(x)}/Z$ satisfying $p(x) \geq 0$ and $\sum_x p(x) = 1$.
We see that the probability of an assignment factorizes over the clauses:

$$p(x) = \frac{1}{Z} \prod_c \phi_c(x)$$

for an appropriately defined $\phi_c(x) = e^{-w_c \mathbb{1}[x \text{ violates clause } c]}$, one factor for each clause $c$.
In Markov logic networks, a basic optimization problem is finding the configuration $x^* = \arg\max_x p(x \mid e)$ for some evidence $e$. Note that maximizing $p(x \mid e)$ is equivalent to minimizing $\text{cost}(x)$ over configurations consistent with the evidence, since $p(x) \propto e^{-\text{cost}(x)}$ is monotonically decreasing in the cost.
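To make the semantics concrete, here is a minimal brute-force sketch with hypothetical weighted clauses (the clause set and weights are made up for illustration): it computes $\text{cost}(x)$, the normalization constant $Z$, and the MAP state by enumeration.

```python
import math
from itertools import product

# Hypothetical weighted clauses over 3 variables:
# (literals, weight), where a literal is (variable, value it wants).
soft = [([(0, True), (1, True)], 2.0),    # x0 or x1, weight 2
        ([(1, False), (2, True)], 1.0)]   # not x1 or x2, weight 1

def cost(x):
    """Sum of the weights of the violated clauses."""
    return sum(w for lits, w in soft
               if not any(x[v] == val for v, val in lits))

states = list(product([False, True], repeat=3))
Z = sum(math.exp(-cost(x)) for x in states)   # normalization constant

x_map = min(states, key=cost)                 # argmax p(x) = argmin cost(x)
print("MAP state:", x_map, "with p =", math.exp(-cost(x_map)) / Z)
```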
Factor graphs
An arbitrary probability distribution over $n$ binary variables requires $2^n - 1$ parameters to represent. Factor graphs allow us to study probability distributions which factorize and can be represented in a compact way.
The basic setup for factor graphs is as follows: for each configuration $x$, let $p(x) = \frac{1}{Z} \prod_{c \in C} \phi_c(x_c)$ over all cliques $c \in C$, where each $\phi_c$ is a function which maps an assignment $x_c$ of the variables in clique $c$ to a nonnegative real number.
We can encode satisfiability problems into the factor graph representation as follows. Given a CNF formula, we can represent each clause $c$ by a clique which is a function of the variables in that clause. If the clause is satisfied by a partial assignment $x_c$ of its variables, we let $\phi_c(x_c) = 1$, and otherwise we let $\phi_c(x_c) = 0$. Then an assignment $x$ that satisfies all of the clauses will have $p(x) = 1/Z$ and otherwise will have $p(x) = 0$, where $Z$ counts the satisfying assignments.
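A minimal sketch of this encoding on a small hypothetical CNF: each clause becomes a 0/1 factor, the weight of an assignment is the product of its factors, and summing the weights over all assignments yields the number of satisfying assignments.

```python
from itertools import product

# (x0 or x1) and (not x1 or x2), as lists of (variable, value it wants).
cnf = [[(0, True), (1, True)], [(1, False), (2, True)]]

def phi(clause, x):
    """Factor for one clause: 1 if satisfied, 0 otherwise."""
    return 1 if any(x[v] == val for v, val in clause) else 0

def weight(x):
    """Product of all clause factors; 1 iff x satisfies the whole formula."""
    w = 1
    for clause in cnf:
        w *= phi(clause, x)
    return w

Z = sum(weight(x) for x in product([False, True], repeat=3))
print("satisfying assignments:", Z)   # here Z = 4
```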
In factor graphs, some probabilistic reasoning tasks that we might want to perform are as follows:
1. Marginal and conditional probabilities: compute $p(x_i = 1)$ or $p(x_i = 1 \mid e)$ for some evidence $e$.
2. Sampling: draw samples $x \sim p(x)$.
3. Compute the partition function $Z$ (the number of satisfying assignments in SAT problems).
4. Most likely state of the world given evidence: compute $\arg\max_x p(x \mid e)$.
Theorem: Problems 1 through 3 are reducible to one another.
Proof ($3 \Rightarrow 1$): If we can compute partition functions, then we can compute a marginal such as $p(x_1 = 1)$ by first computing the partition function $Z$ of the original graphical model, then clamping the variable $x_1 = 1$ and computing the partition function $Z_{x_1 = 1}$ of the clamped model (in SAT problems, counting the number of satisfying assignments with $x_1 = 1$). The marginal probability $p(x_1 = 1) = Z_{x_1 = 1} / Z$ is the ratio of the total weight in the subtree rooted at $x_1 = 1$ and the total weight of the tree.
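A minimal runnable sketch of this reduction, on the same hypothetical 3-variable CNF as above: clamp the first variable, recompute the partition function, and take the ratio.

```python
from itertools import product

cnf = [[(0, True), (1, True)], [(1, False), (2, True)]]

def weight(x):
    """1 iff x satisfies every clause of the CNF."""
    return 1 if all(any(x[v] == val for v, val in cl) for cl in cnf) else 0

def partition(clamped=None):
    """Sum of weights over assignments consistent with `clamped` ({var: value})."""
    clamped = clamped or {}
    return sum(weight(x) for x in product([False, True], repeat=3)
               if all(x[v] == val for v, val in clamped.items()))

# p(x0 = 1) = Z_{x0 = 1} / Z
print("p(x0 = 1) =", partition({0: True}) / partition())   # 3/4 here
```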
Proof ($1 \Rightarrow 3$): If we can compute all (conditional) marginal probabilities, then we can compute the partition function as follows. Rewrite

$$Z = \frac{Z}{Z_{x_1 = a_1}} \cdot \frac{Z_{x_1 = a_1}}{Z_{x_1 = a_1, x_2 = a_2}} \cdots \frac{Z_{x_1 = a_1, \ldots, x_{n-1} = a_{n-1}}}{Z_{x_1 = a_1, \ldots, x_n = a_n}} \cdot Z_{x_1 = a_1, \ldots, x_n = a_n},$$

where we choose the assignments $a_1, \ldots, a_n$ so that the partition function counts are non-zero and the ratios above are inverse ratios of marginal probabilities, i.e., $Z_{x_1 = a_1, \ldots, x_{i-1} = a_{i-1}} / Z_{x_1 = a_1, \ldots, x_i = a_i} = 1 / p(x_i = a_i \mid x_1 = a_1, \ldots, x_{i-1} = a_{i-1})$. The last term is the weight of a single fully specified assignment, which can be evaluated directly.
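A minimal sketch of this telescoping computation, reusing the `partition` and `weight` helpers from the sketch above (the conditional marginals are themselves computed by brute force here, purely to make the sketch runnable):

```python
def Z_from_marginals(n=3):
    """Recover Z as a product of inverse conditional marginals times the
    weight of one fully specified assignment."""
    clamped, Z = {}, 1.0
    for i in range(n):
        # Conditional marginal p(x_i = 1 | variables fixed so far).
        p1 = partition({**clamped, i: True}) / partition(clamped)
        a = p1 >= 0.5                    # pick a value whose marginal is non-zero
        Z /= p1 if a else (1.0 - p1)     # multiply by the inverse marginal
        clamped[i] = a
    # The telescoping product leaves the weight of one full assignment.
    return Z * weight(tuple(clamped[i] for i in range(n)))

print("Z via marginals:", Z_from_marginals())   # matches the direct count of 4
```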
Proof ($2 \Rightarrow 1$): If we can draw samples $x^{(1)}, \ldots, x^{(N)} \sim p$, then we can compute the marginal probability $p(x_i = 1)$ as the empirical frequency $\frac{1}{N} \sum_{j=1}^{N} \mathbb{1}[x_i^{(j)} = 1]$ for $N$ sufficiently large.
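This estimate is immediate to implement given any sampler; a minimal sketch, assuming some exact sampler `sample()` for the model (one such sampler is sketched after the next proof):

```python
# Empirical frequency of x0 = 1 over N exact samples approximates p(x0 = 1).
N = 10_000
print("estimated p(x0 = 1):", sum(sample()[0] for _ in range(N)) / N)
```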
Proof ($1 \Rightarrow 2$): If we can evaluate marginal probabilities, then we can sample. Start at the root of the tree; say this is variable $x_1$. We compute the marginal probability that $x_1$ is true or false. We draw a uniform random number $u \in [0, 1]$ and go down the $x_1 = 1$ branch if $u < p(x_1 = 1)$ and the $x_1 = 0$ branch otherwise. Repeating with the conditional marginals $p(x_2 \mid x_1), p(x_3 \mid x_1, x_2), \ldots$ yields a complete assignment. This works because of the chain rule, since

$$p(x) = p(x_1) \, p(x_2 \mid x_1) \cdots p(x_n \mid x_1, \ldots, x_{n-1}).$$
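A minimal sketch of this procedure (ancestral sampling), again reusing the brute-force `partition` helper from above as the marginal oracle:

```python
import random

def sample(n=3):
    """Draw one exact sample by walking down the tree of variables."""
    clamped = {}
    for i in range(n):
        # Conditional marginal p(x_i = 1 | variables fixed so far).
        p1 = partition({**clamped, i: True}) / partition(clamped)
        clamped[i] = random.random() < p1   # take the x_i = 1 branch w.p. p1
    return tuple(clamped[i] for i in range(n))

print(sample())   # for the 0/1-weighted CNF above: a uniform random satisfying assignment
```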