NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID: 4578 An adaptive Mirror-Prox method for variational inequalities with singular operators

### Reviewer 1

The paper is well written and easy to follow, although prior knowledge of the notation and concepts is occasionally required. I did not carefully check the results, but they generally follow a reasonable line of argument.

### Reviewer 2

Among the positive aspects of this paper: it's well written, the theoretical results are clearly explained, and the related work review is fair and engaging.

Among the negative aspects:

1. It's unclear what the dual norm in (4.2b) is, since all previous definitions refer to the local norm $\| \cdot \|_x$. If this is not a local norm, then the claims that this analysis extends to an arbitrary domain are void, since this variance is typically unbounded for simple functions like quadratics.
2. Part of the contributions of this paper are shared with paper ID #4247, which uses the same framework based on local norms applied to the mirror descent algorithm. While there are differences between both settings, this significantly reduces the novelty of the analysis.

Minor comments:

3. In Theorem 1, D is severely overloaded. It means both the arbitrary constant D and the Bregman distance, which makes, for example, the definition of $C_D$ confusing. I encourage the authors to choose a different name for D.
4. In Eq. (4.7), it's unclear whether the sum of $\gamma_i^2$ is in the numerator or the denominator. Please use parentheses to make this clear.
5. Below Eq. (5.2), is $\rho = \theta$? Otherwise I don't see any $\rho$ in the formula.

#### Post-rebuttal

I have read the authors' rebuttal and the other reviewers' comments. The authors have clarified the relationship with paper ID #4247, which was my main criticism. I have increased my score accordingly.

### Reviewer 3

This paper proposes a regularity condition together with an adaptive mirror-prox (AMP) algorithm aimed at solving variational inequality (VI) problems with possibly singular operators. The authors recover the optimal rate of $O(1/T)$ in the deterministic case by replacing Lipschitz continuity with the proposed condition, both for mirror-prox (MP) and AMP. They also prove an $O(1/\sqrt{T})$ convergence rate for AMP in the stochastic case. The paper presents nice results, but some are not surprising.

Some issues in detail:

1. The idea of a Lipschitz-like condition has been proposed in other works. Also, the Bregman parts are not novel.
2. The stochastic result $O(1/\sqrt{T})$ is basically mirror-prox under the proposed condition. The analysis of AMP covers the deterministic case only, so the statement in Lines 13-15 is somewhat misleading. At the same time, the theory parts are not too strong.