| Paper ID | 7015 |
|---|---|
| Title | Accurate, reliable and fast robustness evaluation |

- The authors present a gradient-based attack that can serve as a standard test of the robustness of machine learning models.
- The paper is well written, and the problem statement is clear and concise.
- The paper makes major algorithmic and empirical contributions.
- At each step, the attack solves a constrained quadratic program with an iterative gradient-based algorithm in order to find the most promising optimization step. After each step, the Lp distance between the adversarial example and the clean input gets smaller.
- Compared to previous work, the authors' approach has only one hyper-parameter to tune (the trust region). It is also straightforward to extend to other norms, as long as the boundary between the adversarial and the non-adversarial region can be described by a differentiable equality constraint.
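To make the per-step procedure summarized above concrete, here is a minimal sketch of one trust-region step along a linearized decision boundary. It is an illustration under simplifying assumptions, not the authors' implementation: it covers only the L2 case, ignores any box constraints on the input, and uses a toy linear classifier `w @ x + b` so that the linearization is exact. The names `boundary_step` and `run_toy_attack` are hypothetical.

```python
import numpy as np

def boundary_step(x_clean, x_adv, grad, logit, radius):
    """One trust-region step along a linearized boundary (L2 case, no box constraints)."""
    d = x_clean - x_adv
    g2 = float(grad @ grad)
    # Normal component: lands on the linearized boundary logit + grad @ delta = 0.
    delta_n = (-logit / g2) * grad
    # Tangential component: the part of d that lies inside the boundary hyperplane.
    d_t = d - (float(grad @ d) / g2) * grad
    budget = radius**2 - float(delta_n @ delta_n)
    if budget <= 0.0:
        # Trust region too small to reach the boundary: only move toward it.
        return x_adv + delta_n * (radius / np.linalg.norm(delta_n))
    t_norm = np.linalg.norm(d_t)
    # Move toward the clean input within the hyperplane, capped by the trust region.
    scale = 1.0 if t_norm == 0.0 else min(1.0, np.sqrt(budget) / t_norm)
    return x_adv + delta_n + scale * d_t

def run_toy_attack(w, b, x_clean, x_start, radius=0.5, steps=50):
    """Repeat the step on a linear model and record the L2 distance to the clean input."""
    x_adv = x_start.copy()
    dists = [np.linalg.norm(x_clean - x_adv)]
    for _ in range(steps):
        logit = float(w @ x_adv + b)
        x_adv = boundary_step(x_clean, x_adv, w, logit, radius)
        dists.append(np.linalg.norm(x_clean - x_adv))
    return x_adv, dists

# Toy setup (assumed values): the clean point has a positive logit,
# and the starting point lies on the decision boundary.
w = np.array([1.0, 2.0])
bias = -1.0
x_clean = np.array([3.0, 2.0])
x_start = np.array([-3.0, 2.0])
x_adv, dists = run_toy_attack(w, bias, x_clean, x_start)
```

In this toy setting the recorded distances shrink step by step until the iterate reaches the point on the boundary closest to the clean input. The trust-region radius is the only tunable quantity in the sketch, which mirrors the single hyper-parameter mentioned in the review.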

The method is inspired by previous ones, but seems to be original. The paper is well written. The paper sets out to establish a new standard for adversarial attacks. This is an active area in which many methods are being proposed, and a general-purpose method has so far been elusive. Even if the proposed method beats the state of the art, it could quickly be superseded by others. The general idea of following the boundary is interesting, and further developments along these lines might take place.

1. This seems to be a straightforward extension of Brendel 2018, and the novelty beyond that work seems quite minimal for a NeurIPS paper. I am not sure whether adopting the gradient-based strategy on top of Brendel 2018 improves attack performance at all.

2. More importantly, the paper is not very clearly written and is therefore rather difficult to understand. For example, the paper does not distinguish clearly between white-box and black-box attacks, which makes the query-efficiency aspect hard to follow. The mathematical formulations are often not explained well, e.g., the c used in Eq. 1 is explained only much later. The experimental settings (e.g., hyperparameter descriptions) are also not explained in detail. I understand that space in the main paper is limited, but more information should at least have been provided in the supplementary material to help readers. The same applies to the budget aspect. Because the notion of attack budget is not as straightforward here as it is in classical attack papers, an apples-to-apples comparison is somewhat convoluted, and the paper does not do a good job of explaining this. Also, how much budget (in the L_\infty case) ends up being used by the algorithm for different examples? A statistical estimate over a population of test examples could be useful.

3. One of the contributions listed in the abstract is that the attacks are more reliable in the face of gradient masking. It would be interesting to understand (maybe geometrically) why this is the case for the proposed approach.

4. Do the results in Table 1 and Table 2 include different \epsilon values as described in the text? I didn't really understand this.

6. Being less sensitive to hyper-parameter tuning is important, but as far as I understand, the benefit shows up when one changes the learning rate significantly. I am not sure this is as big a deal in general as the paper claims. There are other elements, such as initial random perturbations, the number of optimization iterations, etc. I would rather see some analysis of why the proposed algorithm (optimization problem) is fairly insensitive to learning rate variations.

--------------------------------------------------------

I appreciate the authors' response. In particular, the response to the "clarity and contributions" question articulates well the specific advancements beyond Brendel 2018. Based on the rebuttal, I am modifying my score to a 6. I encourage the authors to include the clarifications in the revised draft if the paper is accepted.