"Non-convex" is really vague: Really that means essentially just everything else. I'm reluctant to believe that much that is general and useful can be said about everything that is 'non-convex'.

There is likely an unstated assumption: the optimization problem is a minimization, or we would be talking about non-concave instead.

To drag out NP-completeness is a bit vague and maybe evasive: sure, usually we think of NP-completeness in optimization in the context of discrete optimization or, to keep it simple, integer linear programming. And, IIRC, minimizing a concave function subject to linear constraints is NP-hard. But for minimizing a non-convex objective function, the assumption is that we are dealing with continuous variables, where NP-completeness plays a smaller role?

It's not so clear what properties the constraints have -- are they also defined by non-convex functions?

Usually in optimization with non-convex objective functions, as a first cut we look to satisfy the Kuhn-Tucker necessary conditions, together with some suitable constraint qualification, but the abstract makes no mention of Kuhn-Tucker.
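
To make that concrete (my notation, nothing taken from the abstract): for minimizing f(x) subject to inequality constraints g_i(x) <= 0, the Kuhn-Tucker (KKT) first-order necessary conditions at a local minimizer x*, under a suitable constraint qualification, read

    % Standard KKT conditions for: minimize f(x) s.t. g_i(x) <= 0, i = 1..m
    % (notation mine, not from the paper under discussion)
    \begin{aligned}
      \nabla f(x^*) + \sum_{i=1}^{m} \mu_i \nabla g_i(x^*) &= 0, \\
      g_i(x^*) &\le 0, \quad i = 1, \dots, m, \\
      \mu_i &\ge 0, \quad i = 1, \dots, m, \\
      \mu_i \, g_i(x^*) &= 0 \quad \text{(complementary slackness)}.
    \end{aligned}

For a non-convex f these conditions are only necessary, not sufficient, which is part of the point.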

IIRC there is some old work on, e.g., pits, peaks, and passes, that might help find sufficient conditions for global optima. IIRC for some special case objective functions, there are some transformations that can help.

It may be that the OP would be clearer if we assumed a context, say, image recognition? But the last time I heard about non-convex optimization was for investment portfolio rebalancing, so maybe the context is not just image recognition?

In the work I've done in optimization, the most powerful approach I found was to exploit special structure. So, basically, give up on general approaches and attack each problem one at a time.

The abstract does suggest that what is being considered are evaluations of heuristics, which implies that the paper is about some particular class of problems and is more a report on computational experience than mathematics.

I might get interested in the paper except for the concerns above.



Can you explain where Kuhn-Tucker enters the picture here? Non-convex optimization problems do not necessarily have constraints on the domain.

Most non-convex optimization with bounds on the convergence speed exploits some sort of structure, for example a Lipschitz bound on the gradient. There is some recent work that requires only a semi-metric around the optimizer. That opens up a large class of functions that could not be efficiently optimized before.

http://papers.nips.cc/paper/4304-optimistic-optimization-of-...


> Can you explain where Kuhn-Tucker enters the picture here?

If there are no constraints, then, sure, the Kuhn-Tucker conditions do not have a non-trivial role. In that case, commonly we would state up front that we were considering unconstrained optimization.

Sure, such a Lipschitz bound could give us some help with lower bounds on the objective function value, but for those bounds we would need some constraints! Then, at least in principle, we are back to the Kuhn-Tucker conditions again.

I suspect that the paper is (A) not about all of non-convex optimization but (B) about some part of optimization, maybe some part of image recognition, where the objective functions are non-convex. So, we're talking about a subtle language issue. Or the paper is about some parts of non-convex optimization!!!


> Sure, such a Lipschitz bound could give us some help with lower bounds on the objective function value, but for those bounds we would need some constraints! Then, at least in principle, we are back to the Kuhn-Tucker conditions again.

You don't need constraints for the bounds. That's the beauty of it.

The idea is to sample from the domain and use Lipschitz bounds (or a semi-metric in the cited paper) to upper-bound the target function via the distance to the samples. The question is how to take samples in a principled way that guarantees a "good" speed of convergence.
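
To make the sampling idea concrete, here is a minimal toy sketch in Python (my own construction, not the algorithm from the cited paper): it maximizes a 1-D function on [0, 1], assumes a known Lipschitz constant L, and always samples the point where the optimistic upper bound min_i ( f(x_i) + L * |x - x_i| ) is largest, essentially the old Piyavskii-Shubert scheme. The function names and the test objective are just for illustration.

    import numpy as np

    def lipschitz_maximize(f, lipschitz_const, n_evals=50, grid_size=1000):
        # Maximize f on [0, 1] by repeatedly sampling the grid point with the
        # largest optimistic upper bound  min_i ( f(x_i) + L * |x - x_i| ).
        # Toy Piyavskii-Shubert-style sketch; NOT the semi-metric method
        # from the cited NIPS paper.
        grid = np.linspace(0.0, 1.0, grid_size)
        xs, ys = [0.5], [f(0.5)]        # start with one sample in the middle
        for _ in range(n_evals - 1):
            dists = np.abs(grid[:, None] - np.array(xs)[None, :])
            upper = np.min(np.array(ys)[None, :] + lipschitz_const * dists, axis=1)
            x_next = float(grid[np.argmax(upper)])
            xs.append(x_next)
            ys.append(f(x_next))
        best = int(np.argmax(ys))
        return xs[best], ys[best]

    # A multi-modal (non-convex) test objective; L = 40 safely bounds its slope.
    f = lambda x: np.sin(13 * x) * np.sin(27 * x) / 2 + 0.5
    print(lipschitz_maximize(f, lipschitz_const=40.0, n_evals=60))

No constraints beyond the bounded domain are needed. If I understand the cited paper correctly, its contribution is to get rid of the known Lipschitz constant and the fixed grid, using a hierarchical partitioning of the domain under much weaker (semi-metric) smoothness assumptions.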

There is no free lunch, so some assumptions have to be made about the non-convex target function. However, a semi-metric around the optimizer is a much weaker assumption than a Lipschitz constant. The function doesn't even have to be continuous!

Of course performance is a lot slower than, say, linear programming. But this is non-convex optimization, a much harder problem.


Without some constraints, a Lipschitz bound with a non-zero gradient is no bound at all: for example, a 1-Lipschitz linear function on all of R is unbounded below.



