NIPS 2016
Mon Dec 5th through Sun the 11th, 2016 at Centre Convencions Internacional Barcelona
Paper ID: 637 Solving Marginal MAP Problems with NP Oracles and Parity Constraints

### Reviewer 1

#### Summary

A method for solving marginal MAP problems with NP Oracles and parity constraints is introduced. The method represents the intractable counting subproblem with queries to NP oracles subject to additional parity constraints.

#### Qualitative Assessment

The motivation for this work is Ermon et al.’s method of transforming a counting problem into a series of optimization problems with each corresponding to the original problem subject to randomly generated XOR constraints. In comparison, the proposed algorithm XOR_MAP approximates the intractable sum with a series of optimization problems, which are then folded into a global optimization task of polynomial size of the original MMAP problem. The new algorithm is shown to be able to provide a constant factor approximation to the marginal MAP problem as well as upper and lower bounds on the final result. A strength of the paper is its experimental evaluation. The evaluation includes not only benchmark datasets such as SAT instance and MRF models but also real applications such as computer vision, computational sustainability, and crowdsourcing.

#### Confidence in this Review

3-Expert (read the paper in detail, know the area, quite certain of my opinion)

### Reviewer 2

#### Summary

The authors study the marginal MAP problem in binary graphical models, i.e. where you may need to maximize some variables while marginalizing over others. This problem is intractable in general. Inference has recently been addressed exactly by model counting and in an approximate manner by randomized algorithms wrt a NP-oracle (SAT), and this paper continues this line of work for MMAP.

#### Qualitative Assessment

In terms of novelty, the paper's discussion on leveraging the analysis of hashing-based counting approaches was good, although it would have been nice to get a clear sense of how the claims in MMAP differ from the standard case of doing inference. In terms of scholarship, Ermon's approach is based on doing polynomially many MAP queries, and I was not able to see from the paper whether the current proposal avoid this.

#### Confidence in this Review

2-Confident (read it all; understood it all reasonably well)

### Reviewer 3

#### Summary

This paper aims at approximating marginal MAP (MMAP) problem instances using NP oracles. In the spirit of Ermon et. al.’s work (ICML 2013) for approximating #P-complete problems using pairwise independent hashing and optimization, the authors propose a randomized algorithm (XOR-MMAP) that transforms an MMAP query into a series of joint MAP (or MPE) queries, each begin defined as a nonlinear optimization problem over all variables, subject to randomly generated parity constraints. From a theoretical viewpoint, approximation guarantees for XOR-MMAP are provided in both the unweighted (or binary) case and the weighted case. Comparative experiments, with respect to two state-of-the-art MMAP approximation methods are performed on various domains, including random SAT instances, an image completion task, and a network design application.

#### Qualitative Assessment

There are pros and cons in this article. On the one hand, the idea of approximating complex MMAP queries using a series of random constrained optimization problems is conceptually simple and opens the door to many potential improvements. Furthermore, I really appreciated the diversity of experimental domains used to evaluate the method. On the other hand, I have several concerns about the approximation bounds and the optimization oracle. * Approximation bounds The approximation guarantees provided by Theorem 3.2. and Theorem 3.6 are very weak. For the unweighted case, we have a bound of $4^c$ and, to ensure that the denominator $\alpha(c)$ of the complexity parameter T is greater or equal than 1, $c$ must be greater or equal than 5, which roughly means an approximation constant about one thousand. For the weighted case, the constant is about three thousands. With such large constants, one may wonder whether the approximation guarantees of the algorithm are useful in practice. The extension to the weighted case (Section 3.2) could be improved using a recent article by Chakraborty et al. (IJCAI 15) (From Weighted to Unweighted Model Counting). Indeed, since the weight function is encoded as a literal-weighted CNF (as specified by Lines 358-362 in the S.M.), it can be transformed into an equivalent unweighted (binary) CNF formula of polynomial size. By applying Theorem 2 of Chakraborty and Theorem 5.2, it seems that we can get a better bound for the weighted case of MMAP. In Lines 178-180, the authors suggest to use binary search instead of linear search for the number of calls to the optimization oracle. This is clearly a preferable strategy, but I am wondering whether the approximation bounds are still preserved in this case. In fact, the proof of Theorem 3.2 (Lines 173-177) needs $n$ calls to the oracle in order to derive a probability of $1 - \delta$. By the way, XOR-MMAP+ provided in the S.M. looks more interesting for reducing the complexity parameter $T$, but the number of calls to the oracle is in $O(n log n)$. So, can you obtain similar approximation bounds when XOR-MMAP(+) calls XOR-K only $O(log n)$ times? * Optimization oracle The XOR-MMAP algorithm relies on an optimization oracle which is far from easy to solve. Namely, Eq. (2) of Algorithm 2 is a nonlinear constrained optimization task, where the constraint set is a list of $kT$ random XOR clauses (Lines 87-88), each of size $O(n)$. Based on the standard encoding suggested in the supplementary material, the objective function can be specified as a linear function. But the overall set of constraints for this linear optimization problem involves both XOR clauses (encoding the hash function), and CNF clauses (encoding the weight function). As we know, XOR clauses over $n$ variables cannot be translated into CNF clauses (this would give an exponential number of disjuncts). So, which algorithm do you use to solve this linear optimization problem involving both CNF and XOR clauses? Unless I missed something, there is no discussion in the paper about the practical implementation of this expensive optimization oracle. In light of the above remark, I think that it is crucial to report the running times of all algorithms in the experimental study. Even if XOR-MMAP behaves well in practice for approximating MMAP instances, its optimization oracle is very expensive. So one might really wonder whether XOR-MMAP is scalable, in comparison with SAA and Mixed LBP. * Minor comments Line 182: From Theorem 3.2 -> From Theorem 3.1 The use of “NP oracles” in the title is not - strictly speaking - appropriate. The procedure used by XOR-MMAP is an FP^NP (i.e. constrained optimization) oracle.

#### Confidence in this Review

2-Confident (read it all; understood it all reasonably well)

### Reviewer 4

#### Summary

The authors proposed an algorithm named XOR_MMAP to solve the Marginal MAP Problem. In the proposed algorithm, the intractable counting subproblem are solved via a series of quries to NP oracles, subject to additional parity constraints. Experiments show that the proposed algorithms outperforms the Sample Average Approximation (SAA) algorithm and the Mixed Loopy Belief Propagation (Mixed LBP) algorithm.

#### Qualitative Assessment

The presentation is mostly clear. However, my main concerns is on how to solve the maximization (optimization) problem. It is true that algorithm the proposed Algorithm XOR_MMAP s able to provide a constant factor approximation to the Marginal MAP Problem. However, due to NP-hardness of optimization problen in Eq. (2), it might be impossible to solve Eq. (2) exactly. Fortunately, for several particular class of problems like Eq. (2), epsilon-optimal solutions can be found for Eq. (2) in polynomial time. What will happen if we are only able to provide an epsilon-optimal solutions for Eq. (2)? Also in the experiments, the authors claimed that they are using ACE to solve the counting problem. However, it is unclear how to solve the optimization problem. Minor: In figure 1, why is upper bound larger than lower bound? ======After Rebuttal=========== It is still unclear how the authors solve each optimisation problem with a fixed k.

#### Confidence in this Review

2-Confident (read it all; understood it all reasonably well)

### Reviewer 5

#### Summary

This paper explains a new approach for approximating MMAP. Is based on an approach for approximating SAT, but is modified to applied to MMAP.

#### Qualitative Assessment

I think this paper was really good, and I have nothing to criticize. Side note: In line 32, I think instead of massage it should say message.

#### Confidence in this Review

1-Less confident (might not have understood significant parts)

### Reviewer 6

#### Summary

The paper presents a new approach to approximate Marginal MAP inference using hashing and parity constraints. It is a direct extension of a recent hashing based scheme for approximating (weighted) counting problems. The experimental evaluation compares the new method against two baselines based on sampling (SAA) and belief propagation (MixedBP), and shows promising results on several unweighted and weighted MMAP benchmarks.

#### Qualitative Assessment

Although the method presented seems to perform well in practice, I have a few concerns regarding the empirical evaluation and quality of presentation. 1. I would like to see how it performs against an exact MMAP method. The first two benchmarks (2-SAT, MRFs) look pretty easy and can be solved exactly, e.g. using some of the most recent MMAP solvers based on search (see references below). This is important to better assess the solution quality as well as the running time of the proposed method. 2. There is significant previous work on approximating MMAP, especially in the graphical models community (eg. work by Darwiche at al, Maua and de Campos,Druzdzel et al, Ihler at al). For some reason, this is ignored, or maybe just overlooked. It is important to discuss/relate/compare with these existing approaches. 3. The MixedBP baseline chosen for evaluation is clearly representing the class of variational methods, but may not be the best practical choice. For example, there is the more recent Weighted Mini-Bucket (WMB(i)), which is also a variational method and which performs much better than BP in practice (see also work by Ihler and Liu). 4. I think the presentation of the XOR_MMAP section can be improved more. For instance, it isn't clear how one solves Equation (2) in Algorithm 2. Furthermore, since the algorithm depends on a set of parameters (T, k, c), I think a more detailed discussion is required in order to better understand (a) how to tune them in practice, and (b) their effect on performance. Marinescu, Dechter, Ihler. AND/OR Search for Marginal MAP. UAI-14. Marinescu, Dechter, Ihler. Pushing forward marginal MAP with best-first search. IJCAI-15. Lee, Marinescu, Dechter, Ihler. From exact to anytime MMAP solutions. AAAI-16.

#### Confidence in this Review

3-Expert (read the paper in detail, know the area, quite certain of my opinion)