Here’s a function, f(x). It’s expensive to evaluate, it may not have an analytic expression, and you don’t know its derivative.
Your task: find its global minimum.
This is, for sure, a difficult task, more difficult than most other optimization problems in machine learning. Gradient descent, for one, has access to a function’s derivatives and exploits that mathematical shortcut to reach a minimum quickly.
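For contrast, here is a minimal gradient-descent sketch; it only works because the derivative is known analytically. The quadratic objective and the learning rate are illustrative assumptions, not part of the original problem.

```python
# Minimal gradient-descent sketch: possible only because we can
# evaluate the derivative f'(x) directly. The quadratic objective
# and the learning rate are illustrative assumptions.
def f(x):
    return (x - 2.0) ** 2 + 1.0

def df(x):
    return 2.0 * (x - 2.0)  # known analytic derivative

x, lr = 0.0, 0.1
for _ in range(100):
    x -= lr * df(x)  # step downhill along the gradient

print(x, f(x))  # converges to the minimum at x = 2.0
```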
In some optimization scenarios, by contrast, the function is cheap to evaluate. If we can get hundreds of results for variants of an input x within a few seconds, a simple grid search can be employed with good results.
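As a rough sketch of that cheap-evaluation case (the quadratic objective and the search range below are hypothetical stand-ins):

```python
# Minimal grid-search sketch for a cheap 1-D objective; f and the
# search range are illustrative assumptions.
def f(x):
    return (x - 2.0) ** 2 + 1.0

n = 500
grid = [-10 + 20 * i / (n - 1) for i in range(n)]  # evenly spaced points
best_x = min(grid, key=f)  # keep the point with the lowest f value

print(best_x, f(best_x))  # ~2.0, ~1.0
```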
Alternatively, a whole host of unconventional, gradient-free optimization methods can be used, like particle swarm optimization or simulated annealing.
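To make one of these concrete, here is a minimal simulated-annealing sketch; the objective, step size, and cooling schedule are illustrative assumptions rather than tuned choices:

```python
import math
import random

# Minimal simulated-annealing sketch; f, the step size, and the
# cooling schedule are illustrative assumptions.
def f(x):
    return (x - 2.0) ** 2 + 1.0

x = 0.0     # starting point
temp = 1.0  # initial temperature
for _ in range(10_000):
    candidate = x + random.gauss(0, 0.5)  # random local move
    delta = f(candidate) - f(x)
    # Always accept improvements; accept worse moves with probability
    # exp(-delta / temp), which shrinks as the temperature cools.
    if delta < 0 or random.random() < math.exp(-delta / temp):
        x = candidate
    temp *= 0.999  # geometric cooling

print(x, f(x))
```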
Unfortunately, the task at hand has none of these luxuries. Our optimization is constrained on several fronts, notably: