NIPS 2016
Mon Dec 5th through Sun Dec 11th, 2016, at the Centre Convencions Internacional Barcelona
Paper ID: 2019
Title: A Multi-step Inertial Forward-Backward Splitting Method for Non-convex Optimization

Reviewer 1

Summary

This paper focuses on the problem of minimising the sum of two (possibly non-convex) functions, one of which is differentiable with $L$-Lipschitz continuous gradient. The authors propose a multi-step inertial forward-backward splitting (MiFB) method and investigate its global convergence as well as its local linear convergence. More precisely, when the objective function satisfies the Kurdyka-Łojasiewicz (KL) property and the sequence generated by MiFB is bounded, then under a suitable choice of the parameters, the sequence converges to a critical point $x^*$. When the non-smooth function is partly smooth (PSF) at $x^*$ relative to a $C^2$-submanifold $M_{x^*}$, the smooth function is $C^2$ around $x^*$, and a non-degeneracy condition is satisfied, then under suitable conditions on the parameters the sequence converges linearly. Numerical experiments and comparisons are provided.
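For concreteness, a minimal Python sketch of the kind of multi-step inertial iteration being analysed. The names (grad_F, prox_R), the fixed step-size, and the use of a single extrapolation point for both the gradient and proximal steps are my own simplifications for illustration, not necessarily the paper's exact scheme (in particular, the prox is set-valued in the non-convex case and the paper's parameters may vary per iteration):

    import numpy as np

    def mifb(x0, grad_F, prox_R, gamma, a, n_iter=500):
        """Sketch of a multi-step inertial forward-backward iteration.

        a : list of s inertial weights; s = 1 recovers the usual
            single-step inertial (heavy-ball-like) FB scheme.
        """
        s = len(a)
        hist = [np.asarray(x0, dtype=float)] * (s + 1)  # newest iterate first
        for _ in range(n_iter):
            # extrapolation: y = x_k + sum_{i<s} a_i (x_{k-i} - x_{k-i-1})
            y = hist[0] + sum(a[i] * (hist[i] - hist[i + 1]) for i in range(s))
            # forward (gradient) step on the smooth part, backward (prox) step on R
            x_new = prox_R(y - gamma * grad_F(y), gamma)
            hist = [x_new] + hist[:-1]
        return hist[0]

Setting all weights a[i] = 0 reduces this to plain forward-backward splitting.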

Qualitative Assessment

The proposed method is a variant of the forward-backward splitting (FBS) method. The boundedness assumption on the generated sequence is strong. Moreover, a common drawback in the non-convex case is that we do not know whether the generated sequence converges to a global minimiser or not.

Confidence in this Review

2-Confident (read it all; understood it all reasonably well)


Reviewer 2

Summary

In this paper, a multi-step inertial forward-backward method is proposed to minimise an objective function that can be written as the sum of a smooth (possibly non-convex) function and a proper l.s.c. function. Compared with the most common inertial schemes, the authors enlarge the possible choices of inertial parameters, allowing for combinations (possibly with negative weights) of more than two previous iterates. Convergence to a critical point for arbitrary initialisation, assuming a priori boundedness of the iterates, is proved under the KL property. Local linear convergence is established under partial smoothness. The results significantly extend what is known in the convex setting. The experimental results compare multi-step IFB with classical FB.
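As a point of reference for the enlarged parameter class described above, the classical single-step inertial choice is FISTA's (standard material, not taken from this paper); a short Python illustration of the weight sequence:

    import math

    # FISTA's inertial weights: t_{k+1} = (1 + sqrt(1 + 4 t_k^2)) / 2,
    # a_k = (t_k - 1) / t_{k+1}; a single-step (s = 1) scheme whose
    # weights increase towards 1.
    t = 1.0
    for k in range(5):
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        print(f"k={k}: a_k={(t - 1.0) / t_next:.4f}")
        t = t_next

Whether such choices satisfy the paper's conditions on the inertial parameters is exactly the question raised in the assessment below.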

Qualitative Assessment

The paper provides new insight into the theory of inertial methods. The authors prove convergence to a critical point for non-convex and non-smooth objective functions.

Theoretical analysis: The theoretical results extend known results in two directions: to more general objective functions and to more general inertial schemes. The more interesting aspect is the generalisation to non-convex problems. However, I am not sure of the correctness of the algorithm under the assumptions made in the paper. Indeed, without additional conditions (e.g. boundedness from below) on the function $R$, the proximity operator can be empty, and in this case the iteration is not well-defined (an explicit example is sketched below). One of the main results, Theorem 2.2, establishes convergence of the iterates, assuming boundedness of the iterates. I think a deeper analysis of the class of allowed inertial parameters would be useful. For instance, are classical choices (e.g. the FISTA ones) included in the considered class? The local linear convergence result extends what was known in the convex setting to the non-convex one.

Experiments: The experimental section is interesting, but to me the gains of using $s \geq 2$ are limited. Moreover, in the comparison with FB, why is the step-size so small? FB should work with $\gamma \leq 1/L$. Does a choice of $s \geq 1$ require smaller step-sizes?

In view of the previous remarks, I think that the paper presents an interesting theoretical analysis, but the practical impact of the studied generalisation is not clear to me. In particular, several additional parameters must be chosen compared with standard inertial schemes, and the advantages shown in the experiments look limited.
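To make the well-definedness concern concrete, one illustrative example (my own, not from the paper): take $R(x) = -x^4$ on $\mathbb{R}$. For every $\gamma > 0$ and every $y$, the prox objective $-x^4 + \frac{1}{2\gamma}(x - y)^2$ tends to $-\infty$ as $|x| \to \infty$, so $\mathrm{prox}_{\gamma R}(y) = \mathrm{argmin}_x \, R(x) + \frac{1}{2\gamma}(x - y)^2$ is empty and the iteration cannot proceed. A prox-boundedness assumption (e.g. $R$ bounded below, or $R + \frac{1}{2\gamma}\|\cdot\|^2$ coercive for the chosen $\gamma$) would rule this out.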

Confidence in this Review

2-Confident (read it all; understood it all reasonably well)


Reviewer 3

Summary

This paper proposes a multi-step inertial forward-backward splitting algorithm to solve composite nonconvex minimization problems. The paper is motivated by several applications, which are essentially based on either the $\ell_0$ regularizer or the rank regularizer. The authors present the algorithm and show its global convergence to a critical point under the well-known KL condition. They also prove local linear convergence, which requires some technical conditions. A couple of numerical examples are presented to illustrate the theoretical bounds as well as comparisons between different settings.
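For context on why forward-backward methods apply to these non-convex regularizers at all: the $\ell_0$ penalty admits a closed-form (set-valued) proximity operator, hard thresholding. A minimal Python sketch of one selection (gam_lam stands for the product of step-size and regularization weight; my notation, not the paper's):

    import numpy as np

    def prox_l0(y, gam_lam):
        # prox of x -> gam_lam * ||x||_0: keep y_i when y_i^2 / 2 > gam_lam,
        # i.e. |y_i| > sqrt(2 * gam_lam); ties are resolved to 0 here.
        y = np.asarray(y, dtype=float)
        return np.where(np.abs(y) > np.sqrt(2.0 * gam_lam), y, 0.0)

The rank regularizer behaves analogously, with the same thresholding applied to the singular values.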

Qualitative Assessment

In my opinion, this paper is a natural extension of existing heavy-ball methods in the literature, but the theoretical analysis seems to be new, although the proof technique is essentially based on the well-known KL condition. The local convergence result is also new. However, the algorithm has several parameters, which are hard for users to set automatically in practice. In addition, different values of s do not make a major difference in performance, which may not motivate users to adopt this algorithm. The technical proofs are very long, and I was unfortunately not able to check them in detail.

Confidence in this Review

2-Confident (read it all; understood it all reasonably well)


Reviewer 4

Summary

This paper investigates the problem of minimizing the sum of two (not necessarily convex) functions. The authors propose a multi-step inertial forward-backward splitting algorithm and prove its global convergence.

Qualitative Assessment

I have a few comments.

1. The authors proved the convergence of the sequence generated by the proposed algorithm to a critical point. However, a critical point is in general not a global minimizer. Can the authors explain the notion of global convergence in a more formal way?

2. In the numerical experiment section, the authors consider only one run for each model setting. The experiments should be replicated multiple times (for example, 100 times) to evaluate the performance of the proposed algorithm. Furthermore, the $\ell_0$ or the rank function could introduce instability into the optimization problem, and the results presented reflect only one specific example.

Confidence in this Review

2-Confident (read it all; understood it all reasonably well)