Optimization
Convex Analysis 豆瓣
作者: Ralph Tyrell Rockafellar Princeton University Press 1996
Available for the first time in paperback, R. Tyrrell Rockafellar's classic study presents readers with a coherent branch of nonlinear mathematical analysis that is especially suited to the study of optimization problems. Rockafellar's theory differs from classical analysis in that differentiability assumptions are replaced by convexity assumptions. The topics treated in this volume include: systems of inequalities, the minimum or maximum of a convex function over a convex set, Lagrange multipliers, minimax theorems and duality, as well as basic results about the structure of convex sets and the continuity and differentiability of convex functions and saddle-functions. This book has firmly established a new and vital area not only for pure mathematics but also for applications to economics and engineering. A sound knowledge of linear algebra and introductory real analysis should provide readers with sufficient background for this book. There is also a guide for the reader who may be using the book as an introduction, indicating which parts are essential and which may be skipped on a first reading.
Non-convex Optimization for Machine Learning 豆瓣
作者: Prateek Jain / Purushottam Kar Now Publishers Inc 2017
Prateek Jain and Purushottam Kar (2017), "Non-convex Optimization for Machine Learning", Foundations and Trends® in Machine Learning: Vol. 10: No. 3-4, pp 142-363. http://dx.doi.org/10.1561/2200000058
https://www.nowpublishers.com/article/Details/MAL-058
https://www.prateekjain.org/publications/all_papers/JainK17_FTML.pdf
Non-convex Optimization for Machine Learning takes an in-depth look at the basics of non-convex optimization with applications to machine learning. It introduces the rich literature in this area, as well as equipping the reader with the tools and techniques needed to analyze these simple procedures for non-convex problems.
Non-convex Optimization for Machine Learning is as self-contained as possible while not losing focus of the main topic of non-convex optimization techniques. Entire chapters are devoted to present a tutorial-like treatment of basic concepts in convex analysis and optimization, as well as their non-convex counterparts. As such, this monograph can be used for a semester-length course on the basics of non-convex optimization with applications to machine learning. On the other hand, it is also possible to cherry pick individual portions, such the chapter on sparse recovery, or the EM algorithm, for inclusion in a broader course. Several courses such as those in machine learning, optimization, and signal processing may benefit from the inclusion of such topics.
Non-convex Optimization for Machine Learning concludes with a look at four interesting applications in the areas of machine learning and signal processing and explores how the non-convex optimization techniques introduced earlier can be used to solve these problems.
First-Order Methods in Optimization 豆瓣
作者: Amir Beck SIAM-Society for Industrial and Applied Mathematics 2017 - 10
The primary goal of this book is to provide a self-contained, comprehensive study of the main first-order methods that are frequently used in solving large-scale problems. First-order methods exploit information on values and gradients/subgradients (but not Hessians) of the functions composing the model under consideration. With the increase in the number of applications that can be modeled as large or even huge-scale optimization problems, there has been a revived interest in using simple methods that require low iteration cost as well as low memory storage.
The author has gathered, reorganized, and synthesized (in a unified manner) many results that are currently scattered throughout the literature, many of which cannot be typically found in optimization books.
First-Order Methods in Optimization offers comprehensive study of first-order methods with the theoretical foundations; provides plentiful examples and illustrations; emphasizes rates of convergence and complexity analysis of the main first-order methods used to solve large-scale problems; and covers both variables and functional decomposition methods.
Audience: This book is intended primarily for researchers and graduate students in mathematics, computer sciences, and electrical and other engineering departments. Readers with a background in advanced calculus and linear algebra, as well as prior knowledge in the fundamentals of optimization (some convex analysis, optimality conditions, and duality), will be best prepared for the material.
Numerical Optimization 豆瓣
作者: Jorge Nocedal / Stephen Wright Springer 2006 - 7
Optimization is an important tool used in decision science and for the analysis of physical systems used in engineering. One can trace its roots to the Calculus of Variations and the work of Euler and Lagrange. This natural and reasonable approach to mathematical programming covers numerical methods for finite-dimensional optimization problems. It begins with very simple ideas progressing through more complicated concepts, concentrating on methods for both unconstrained and constrained optimization.
Partitional Clustering via Nonsmooth Optimization 豆瓣
作者: Adil M. Bagirov / Napsu Karmitsa Springer
This book describes optimization models of clustering problems and clustering algorithms based on optimization techniques, including their implementation, evaluation, and applications. The book gives a comprehensive and detailed description of optimization approaches for solving clustering problems; the authors' emphasis on clustering algorithms is based on deterministic methods of optimization. The book also includes results on real-time clustering algorithms based on optimization techniques, addresses implementation issues of these clustering algorithms, and discusses new challenges arising from big data. The book is ideal for anyone teaching or learning clustering algorithms. It provides an accessible introduction to the field and it is well suited for practitioners already familiar with the basics of optimization.
High-Dimensional Probability 豆瓣
作者: Roman Vershynin Cambridge University Press 2018 - 9
High-dimensional probability offers insight into the behavior of random vectors, random matrices, random subspaces, and objects used to quantify uncertainty in high dimensions. Drawing on ideas from probability, analysis, and geometry, it lends itself to applications in mathematics, statistics, theoretical computer science, signal processing, optimization, and more. It is the first to integrate theory, key tools, and modern applications of high-dimensional probability. Concentration inequalities form the core, and it covers both classical results such as Hoeffding's and Chernoff's inequalities and modern developments such as the matrix Bernstein's inequality. It then introduces the powerful methods based on stochastic processes, including such tools as Slepian's, Sudakov's, and Dudley's inequalities, as well as generic chaining and bounds based on VC dimension. A broad range of illustrations is embedded throughout, including classical and modern results for covariance estimation, clustering, networks, semidefinite programming, coding, dimension reduction, matrix completion, machine learning, compressed sensing, and sparse regression.
Optimization Algorithms on Matrix Manifolds 豆瓣
作者: Absil, P. a./ Mahony, Robert/ Sepulchre, Rodolphe Princeton Univ Pr 2007
Many problems in the sciences and engineering can be rephrased as optimization problems on matrix search spaces endowed with a so-called manifold structure. This book shows how to exploit the special structure of such problems to develop efficient numerical algorithms. It places careful emphasis on both the numerical formulation of the algorithm and its differential geometric abstraction - illustrating how good algorithms draw equally from the insights of differential geometry, optimization, and numerical analysis. Two more theoretical chapters provide readers with the background in differential geometry necessary to algorithmic development. In the other chapters, several well-known optimization methods such as steepest descent and conjugate gradients are generalized to abstract manifolds. The book provides a generic development of each of these methods, building upon the material of the geometric chapters. It then guides readers through the calculations that turn these geometrically formulated methods into concrete numerical algorithms. The state-of-the-art algorithms given as examples are competitive with the best existing algorithms for a selection of eigenspace problems in numerical linear algebra. "Optimization Algorithms on Matrix Manifolds" offers techniques with broad applications in linear algebra, signal processing, data mining, computer vision, and statistical analysis. It can serve as a graduate-level textbook and will be of interest to applied mathematicians, engineers, and computer scientists.
Distributed Optimization and Statistical Learning Via the Alternating Direction Method of Multipliers 豆瓣
作者: Stephen Boyd / Neal Parikh Now Publishers Inc 2011
https://web.stanford.edu/~boyd/papers/admm_distr_stats.html
Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this review, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas–Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for ℓ1 problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop Map Reduce implementations.
Computational Optimal Transport 豆瓣
作者: Gabriel Peyré / Marco Cuturi Now Publishers Inc 2019 - 5
https://optimaltransport.github.io/
https://www.nowpublishers.com/article/Details/MAL-073
The goal of Optimal Transport (OT) is to define geometric tools that are useful to compare probability distributions. Their use dates back to 1781.
Recent years have witnessed a new revolution in the spread of OT, thanks to the emergence of approximate solvers that can scale to sizes and dimensions that are relevant to data sciences. Thanks to this newfound scalability, OT is being increasingly used to unlock various problems in imaging sciences (such as color or texture processing), computer vision and graphics (for shape manipulation) or machine learning (for regression, classification and density fitting). This monograph reviews OT with a bias toward numerical methods and their applications in data sciences, and sheds lights on the theoretical properties of OT that make it particularly useful for some of these applications.
Computational Optimal Transport presents an overview of the main theoretical insights that support the practical effectiveness of OT before explaining how to turn these insights into fast computational schemes. Written for readers at all levels, the authors provide descriptions of foundational theory at two-levels. Generally accessible to all readers, more advanced readers can read the specially identified more general mathematical expositions of optimal transport tailored for discrete measures. Furthermore, several chapters deal with the interplay between continuous and discrete measures, and are thus targeting a more mathematically-inclined audience.
This monograph will be a valuable reference for researchers and students wishing to get a thorough understanding of Computational Optimal Transport, a mathematical gem at the interface of probability, analysis and optimization.
Evolutionary Learning: Advances in Theories and Algorithms 豆瓣
演化学习:理论与算法进展
作者: Zhou, Zhi-Hua / Yu, Yang Springer 2019 - 7
Many machine learning tasks involve solving complex optimization problems, such as working on non-differentiable, non-continuous, and non-unique objective functions; in some cases it can prove difficult to even define an explicit objective function. Evolutionary learning applies evolutionary algorithms to address optimization problems in machine learning, and has yielded encouraging outcomes in many applications. However, due to the heuristic nature of evolutionary optimization, most outcomes to date have been empirical and lack theoretical support. This shortcoming has kept evolutionary learning from being well received in the machine learning community, which favors solid theoretical approaches.
Recently there have been considerable efforts to address this issue. This book presents a range of those efforts, divided into four parts. Part I briefly introduces readers to evolutionary learning and provides some preliminaries, while Part II presents general theoretical tools for the analysis of running time and approximation performance in evolutionary algorithms. Based on these general tools, Part III presents a number of theoretical findings on major factors in evolutionary optimization, such as recombination, representation, inaccurate fitness evaluation, and population. In closing, Part IV addresses the development of evolutionary learning algorithms with provable theoretical guarantees for several representative tasks, in which evolutionary learning offers excellent performance.
2019年5月24日 在读 集成学习大牛周志华新书。至少我能感受到 ensemble 跟 EA(evolutionary algorithm) 还挺有联系。看到 IEEE Transactions on Evolutionary Computation 这个期刊的影响因子之高,也一定程度反映,很多机器学习的优化是真没办法,非要用上演化、群智能等启发式方法来优化。
Machine_Learning Optimization
Linear Algebra Done Right (3rd ed) 豆瓣
10.0 (8 个评分) 作者: Sheldon Axler Springer International Publishing 2014 - 11
New edition extensively revised and updated
Covers new topics such as product spaces, quotient spaces, and dual spaces
Features new visually appealing format for both print and electronic versions
Includes almost three times the number of exercises as the previous edition
This best-selling textbook for a second course in linear algebra is aimed at undergrad math majors and graduate students. The novel approach taken here banishes determinants to the end of the book. The text focuses on the central goal of linear algebra: understanding the structure of linear operators on finite-dimensional vector spaces. The author has taken unusual care to motivate concepts and to simplify proofs. A variety of interesting exercises in each chapter helps students understand and manipulate the objects of linear algebra.
The third edition contains major improvements and revisions throughout the book. More than 300 new exercises have been added since the previous edition. Many new examples have been added to illustrate the key ideas of linear algebra. New topics covered in the book include product spaces, quotient spaces, and dual spaces. Beautiful new formatting creates pages with an unusually pleasant appearance in both print and electronic versions.
No prerequisites are assumed other than the usual demand for suitable mathematical maturity. Thus the text starts by discussing vector spaces, linear independence, span, basis, and dimension. The book then deals with linear maps, eigenvalues, and eigenvectors. Inner-product spaces are introduced, leading to the finite-dimensional spectral theorem and its consequences. Generalized eigenvectors are then used to provide insight into the structure of a linear operator.
From reviews of previous editions:
“… a didactic masterpiece”
—Zentralblatt MATH
“… a tour de force in the service of simplicity and clarity … The most original linear algebra book to appear in years, it certainly belongs in every undergraduate library.”
—CHOICE
The determinant-free proofs are elegant and intuitive.
—American Mathematical Monthly
“Clarity through examples is emphasized … the text is ideal for class exercises … I congratulate the author and the publisher for a well-produced textbook on linear algebra.”
—Mathematical Reviews
Convex Optimization: Algorithms and Complexity 豆瓣
作者: Sébastien Bubeck Now Publishers Inc 2015
This monograph presents the main complexity theorems in convex optimization and their corresponding algorithms. It begins with the fundamental theory of black-box optimization and proceeds to guide the reader through recent advances in structural optimization and stochastic optimization. The presentation of black-box optimization, strongly influenced by the seminal book by Nesterov, includes the analysis of cutting plane methods, as well as (accelerated) gradient descent schemes. Special attention is also given to non-Euclidean settings (relevant algorithms include Frank-Wolfe, mirror descent, and dual averaging), and discussing their relevance in machine learning. The text provides a gentle introduction to structural optimization with FISTA (to optimize a sum of a smooth and a simple non-smooth term), saddle-point mirror prox (Nemirovski's alternative to Nesterov's smoothing), and a concise description of interior point methods. In stochastic optimization it discusses stochastic gradient descent, mini-batches, random coordinate descent, and sublinear algorithms. It also briefly touches upon convex relaxation of combinatorial problems and the use of randomness to round solutions, as well as random walks based methods.
2018年8月13日 在读 凸优化先读这个https://www.zhihu.com/question/68418633/answer/264232350
http://www.cs.cmu.edu/~aarti/Class/10725_Fall17/
Machine_Learning Optimization
Convex Optimization 豆瓣 Goodreads
Convex Optimization
8.0 (5 个评分) 作者: Stephen Boyd / Lieven Vandenberghe Cambridge University Press 2004 - 3
Convex optimization problems arise frequently in many different fields. This book provides a comprehensive introduction to the subject, and shows in detail how such problems can be solved numerically with great efficiency. The book begins with the basic elements of convex sets and functions, and then describes various classes of convex optimization problems. Duality and approximation techniques are then covered, as are statistical estimation techniques. Various geometrical problems are then presented, and there is detailed discussion of unconstrained and constrained minimization problems, and interior-point methods. The focus of the book is on recognizing convex optimization problems and then finding the most appropriate technique for solving them. It contains many worked examples and homework exercises and will appeal to students, researchers and practitioners in fields such as engineering, computer science, mathematics, statistics, finance and economics.
2018年8月13日 在读 angry!!! 没有凸优化基础根本看不了超一线的机器学习paper!!! https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf
Machine_Learning Optimization