Submitted by Assigned_Reviewer_1
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
This paper gives a new, simple algorithm for block models (especially when the graph is bipartite and the two sides are highly imbalanced), and shows that planted CSPs can be reduced to this case. As a result, the paper gives simple "spectral"-like algorithms for planted CSPs for all k, where previously, for odd k, the only known results required SDPs or more complicated spectral analysis.
This is a very interesting paper. The new algorithm is clean and efficient. The reviewer feels this gives a significantly better understanding of planted k-CSPs when k is odd. The main algorithmic technique is based on simple power iteration, but the subsampling idea is important (otherwise it reduces to the spectral approach, which we know does not work in certain cases). The proof relies on a careful induction on the mean and variance of the entries.
Minor points: Definition 3: For this to be useful there must also be a notion of distance to r-wise independence (in other words, a lower bound on the Fourier coefficients); it would be nice to discuss how that appears in Theorem 2 below.
Section 5 explains why spectral algorithms cannot work; it would be nice to also explain why subsampling helps, as the algorithm still looks very much like the power method for computing the top singular vectors. The induction proof does not give much intuition.
Overall I think this paper is very nice and should be accepted.
Q2: Please summarize your review in 1-2 sentences
This is a very interesting paper that gives clean, efficient algorithms for stochastic block models and especially planted CSPs.
Submitted by Assigned_Reviewer_2
Q1: Comments to author(s).
This paper is novel and solid. I do not have any further concerns about the quality of this paper.
Q2: Please summarize your review in 1-2 sentences
This paper proposes an efficient power method for the stochastic block model and planted constraint satisfaction problems. The writing is very clear and the technical proofs are sound. I would like to suggest "accept" for this paper.
Submitted by Assigned_Reviewer_3
Q1: Comments to author(s).
This paper provides an algorithm for recovering the partitions in the stochastic block model and planted constraint satisfaction problems (CSPs). The main result is for the bipartite version of the stochastic block model, and the planted k-CSP is reduced to the block model. The statistical and computational properties of the algorithm are discussed and analyzed in the paper.
I believe the presentation of the paper can be improved. For someone who is not familiar with these models, it is not easy to understand all the details. It would be useful to add some figures and visual representations of these models.
Typo: Second line from the bottom of page 3: "requires" → "require"
Q2: Please summarize your review in 1-2 sentences
The paper has a reasonable amount of contribution, but the presentation can be improved. For instance, some of the figures provided in the long version can be added to the NIPS draft to make it easier for the reader to understand the models.
Submitted by Assigned_Reviewer_4
Q1: Comments to author(s).
This paper proposes a resampled power iteration algorithm, composed of four phases, that operates on random bipartite graphs in order to recover planted solutions of the stochastic block model. In addition, the planted k-CSP problem can be reduced to this problem and thus solved by the same approach. Compared to SDP-based methods, the proposed one is faster and requires only a low density of edges/constraints.
The theoretical part of this paper is solid and the results improve substantially on previous methods. However, the time-complexity claims need to be verified by simulations or experiments on real applications, which are not included in the paper.
Q2: Please summarize your review in 1-2 sentences
The proposed resampled power iteration algorithm recovers planted solutions of the stochastic block model via a bipartite generalization, and reduces the planted k-CSP to the same problem. The edge/constraint density required for complete recovery is the best known. The theoretical part is solid, but no simulations or experiments on real applications are given in the paper.
Q1: Author rebuttal: Please respond to any concerns raised in the reviews. There are no constraints on how you want to argue your case, except for the fact that your text should be limited to a maximum of 5000 characters. Note however, that reviewers and area chairs are busy and may not read long vague rebuttals. It is in your own interest to be concise and to the point.
1. Figures/explanation of the model. Thanks for the suggestion. We will add a figure and improve the explanation of the model.
2. How general is the algorithm? (e.g., more than 2 blocks?) First, via the reduction described, the algorithm can be used to recover a planted assignment for any planted k-SAT/k-CSP problem up to the current best thresholds. Second, for any number of blocks, one could run an iteration similar to the subspace power iteration (which simultaneously finds multiple eigenvectors), but with subsampling; however, we have not analyzed the performance of this variant.
3. Why/how is it better than SVD? Our algorithm is provably better than SVD. In follow-up work, Florescu and Perkins [2015] have shown that the threshold for SVD to work is actually higher: it needs n_1^{1/3} n_2^{2/3} edges in place of the sqrt(n_1 n_2) needed by our algorithm (n_2 is the size of the larger block). In other words, SVD needs a factor of (n_2/n_1)^{1/6} asymptotically more edges before it is guaranteed to succeed, whereas our algorithm nearly matches the information-theoretic lower bound. Thus the main result of our paper is this provable asymptotic improvement over the SVD; we will, however, add simulation results to compare the performance of the two algorithms on instances of different sizes. The intuition for why subsampling helps is that at low densities, the singular vectors of the rectangular adjacency matrix localize on a small number of coordinates (vertices of high degree); in the subsampled matrices the high-degree vertices are different for each matrix, and so the vectors do not localize.
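To make the resampling idea above concrete, here is a minimal sketch in NumPy: the edge set is split into disjoint random batches, and each matrix-vector multiplication in the power iteration uses a fresh batch, so the high-degree vertices differ from step to step. This is an informal illustration, not the paper's four-phase algorithm; the function name, the parameter T, and the rounding at the end are all hypothetical choices for this sketch.

```python
import numpy as np

def resampled_power_iteration(edges, n1, n2, T=20, rng=None):
    """Sketch of power iteration with edge resampling on a bipartite graph.

    `edges` is a list of (i, j) pairs with i in [n1] (smaller side) and
    j in [n2] (larger side).  Instead of iterating with one fixed
    adjacency matrix, each multiplication uses a fresh random subsample
    of the edges, so the iterate cannot localize on a fixed set of
    high-degree vertices.  (Hypothetical simplification of the paper's
    four-phase algorithm.)
    """
    rng = np.random.default_rng(rng)
    edges = np.asarray(edges)

    # Randomly split the edge set into 2*T disjoint batches,
    # two per iteration (one per multiplication).
    perm = rng.permutation(len(edges))
    batches = np.array_split(edges[perm], 2 * T)

    def batch_matrix(batch):
        # Dense 0/1 adjacency matrix of one edge batch.
        M = np.zeros((n1, n2))
        if len(batch):
            M[batch[:, 0], batch[:, 1]] = 1.0
        return M

    # Random unit-norm starting vector on the larger side.
    x = rng.standard_normal(n2)
    x /= np.linalg.norm(x)
    for t in range(T):
        u = batch_matrix(batches[2 * t]) @ x        # fresh sample, left multiply
        u /= np.linalg.norm(u) + 1e-12
        x = batch_matrix(batches[2 * t + 1]).T @ u  # fresh sample, right multiply
        x /= np.linalg.norm(x) + 1e-12
    # Round the final iterate to +/-1 labels for the larger side.
    return np.where(x >= 0, 1.0, -1.0)
```

With a fixed matrix this would be ordinary power iteration toward the top singular vector; the only change is that every multiplication draws on a disjoint edge sample.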
4. Distance to r-wise independence. We can in fact derive an explicit formula for the parameter delta in the block model in terms of the magnitude of the Fourier coefficients from Definition 3; this is a good suggestion and we will be explicit about the dependence.
