Work

Methods for Derivative-Free Optimization with Applications in Machine Learning

Public

Derivative-free optimization (DFO) has received growing attention due to important problems arising in practice. Various research communities, ranging from machine learning to engineering design, have adopted distinct DFO methods. In this thesis, we present extensive studies as a meaningful step towards a comprehensive understanding of DFO methods. We study the relative merits of this disparate class of methods in a controlled setting, where the derivatives of the functions are unavailable and the functions values are subject to noise. In Chapter 2, we provide numerical investigations of finite-difference-based approaches against interpolation-based methods for solving unconstrained, (nonlinear) least-square and constrained optimization problems. This approach relies on finite differences to perform gradient approximations and applies existing gradient-based nonlinear optimization solvers. The numerical results demonstrate the effectiveness of finite-difference-based methods for both deterministic and noisy problems, when a proper implementation of finite differences is employed. The empirical results suggest that finite-difference-based methods are competitive with IBO methods, which are among the most reliable and efficient DFO solvers. We present in Chapter 3 an adaptive procedure to choosing a reliable finite differencing interval for noisy DFO problems, using a bisection approach. When noise is present in the objective function, the finite differencing interval needs to be chosen meticulously to balance the truncation error and noise. In particular, the optimal interval depends on the noise level of the function and a bound on the higher-order derivative. We show that the proposed adaptive procedure produces near-optimal differencing intervals without explicitly approximating higher-order derivatives. We demonstrate its robustness and accuracy on selected problems. In Chapter 4, we investigate general constrained DFO, where the constraints are analytically available. We propose to extend unconstrained interpolation-based optimization methods by combining them with a general-purpose nonlinear programming solver. This is a feasible method that constructs quadratic models of the objective and generates feasible iterates at every iteration. We discuss its effectiveness and limitations in this chapter. Our numerical results suggest that, when the objective is expensive to evaluate and the cost for evaluating constraints is negligible, this seemingly expensive approach is competitive with a state-of-the-art constrained DFO solvers that employs finite differences. Next, in Chapter 5 we consider a DFO problem that arises in Natural Language Processing. With the wide success of pre-trained large language models (LLMs) and their growing model size, it is imperative to consider strategies that efficiently adapt pre-trained general LLMs to individual language tasks. In particular, we consider prompt design, which makes use of different prompts to condition LLMs for different tasks. We propose to treat prompt design as a noisy DFO problem, wherein we search for a prompt that optimizes performance without accessing the parameters of the underlying LLM. We present competitive results for solving prompt design using DFO methods and identify the relative merits of NEWUOA, FD L-BFGS and CMA-ES. The observations in Chapter 5 motivate us to further analyze the relative strengths of NEWUOA and CMA-ES in Chapter 6. In particular, we consider a broader range of unconstrained problems from the CUTEst problem set. We observe that, contrary to the noiseless problems, while NEWUOA still demonstrates greater efficiency in achieving a desired level of accuracy, it is adversely affected by noise in function evaluations and consequently achieves lower accuracy compared to CMA-ES. Therefore, we investigate the impact of noise on IBO methods and briefly discuss strategies to improve the final accuracy of IBO methods in the presence of noise.

Creator
DOI
Subject
Language
Alternate Identifier
Date created
Resource type
Rights statement

Relationships

Items