Machine_Learning
Probabilistic Machine Learning: An Introduction Douban
Author: Kevin P. Murphy The MIT Press 2022 - 3
A detailed and up-to-date introduction to machine learning, presented through the unifying lens of probabilistic modeling and Bayesian decision theory.
This book offers a detailed and up-to-date introduction to machine learning (including deep learning) through the unifying lens of probabilistic modeling and Bayesian decision theory. The book covers mathematical background (including linear algebra and optimization), basic supervised learning (including linear and logistic regression and deep neural networks), as well as more advanced topics (including transfer learning and unsupervised learning). End-of-chapter exercises allow students to apply what they have learned, and an appendix covers notation.
Probabilistic Machine Learning grew out of the author’s 2012 book, Machine Learning: A Probabilistic Perspective. More than just a simple update, this is a completely new book that reflects the dramatic developments in the field since 2012, most notably deep learning. In addition, the new book is accompanied by online Python code, using libraries such as scikit-learn, JAX, PyTorch, and TensorFlow, which can be used to reproduce nearly all the figures; this code can be run inside a web browser using cloud-based notebooks, and provides a practical complement to the theoretical topics discussed in the book. This introductory text will be followed by a sequel that covers more advanced topics, taking the same probabilistic approach.
Generalized Principal Component Analysis Douban
Author: René Vidal / Yi Ma Springer 2016 - 4
This book provides a comprehensive introduction to the latest advances in the mathematical theory and computational tools for modeling high-dimensional data drawn from one or multiple low-dimensional subspaces (or manifolds) and potentially corrupted by noise, gross errors, or outliers. This challenging task requires the development of new algebraic, geometric, statistical, and computational methods for efficient and robust estimation and segmentation of one or multiple subspaces. The book also presents interesting real-world applications of these new methods in image processing, image and video segmentation, face recognition and clustering, and hybrid system identification.
This book is intended to serve as a textbook for graduate students and beginning researchers in data science, machine learning, computer vision, image and signal processing, and systems theory. It contains ample illustrations, examples, and exercises and is made largely self-contained with three appendices that survey basic concepts and principles from statistics, optimization, and algebraic geometry used in this book.
Theoretical Statistics Douban
Author: Robert W. Keener Springer 2010 - 9
Intended as the text for a sequence of advanced courses, this book covers major topics in theoretical statistics in a concise and rigorous fashion. The discussion assumes a background in advanced calculus, linear algebra, probability, and some analysis and topology. Measure theory is used, but the notation and basic results needed are presented in an initial chapter on probability, so prior knowledge of these topics is not essential. The presentation is designed to expose students to as many of the central ideas and topics in the discipline as possible, balancing various approaches to inference as well as exact, numerical, and large sample methods. Moving beyond more standard material, the book includes chapters introducing bootstrap methods, nonparametric regression, equivariant estimation, empirical Bayes, and sequential design and analysis. The book has a rich collection of exercises. Several of them illustrate how the theory developed in the book may be used in various applications. Solutions to many of the exercises are included in an appendix.
Explaining the Success of Nearest Neighbor Methods in Prediction Douban
Author: George H. Chen / Devavrat Shah Now Publishers Inc 2018
George H. Chen and Devavrat Shah (2018), "Explaining the Success of Nearest Neighbor Methods in Prediction", Foundations and Trends® in Machine Learning: Vol. 10: No. 5-6, pp 337-588. http://dx.doi.org/10.1561/2200000064
https://www.nowpublishers.com/article/Details/MAL-064
https://devavrat.mit.edu/wp-content/uploads/2018/03/nn_survey.pdf
Many modern methods for prediction leverage nearest neighbor search to find past training examples most similar to a test example, an idea that dates back in text to at least the 11th century and has stood the test of time. This monograph aims to explain the success of these methods, both in theory, for which we cover foundational nonasymptotic statistical guarantees on nearest-neighbor-based regression and classification, and in practice, for which we gather prominent methods for approximate nearest neighbor search that have been essential to scaling prediction systems reliant on nearest neighbor analysis to handle massive datasets. Furthermore, we discuss connections to learning distances for use with nearest neighbor methods, including how random decision trees and ensemble methods learn nearest neighbor structure, as well as recent developments in crowdsourcing and graphons. In terms of theory, our focus is on nonasymptotic statistical guarantees, which we state in the form of how many training data and what algorithm parameters ensure that a nearest neighbor prediction method achieves a user-specified error tolerance. We begin with the most general of such results for nearest neighbor and related kernel regression and classification in general metric spaces. In such settings in which we assume very little structure, what enables successful prediction is smoothness in the function being estimated for regression, and a low probability of landing near the decision boundary for classification. In practice, these conditions could be difficult to verify empirically for a real dataset. We then cover recent theoretical guarantees on nearest neighbor prediction in the three case studies of time series forecasting, recommending products to people over time, and delineating human organs in medical images by looking at image patches. 
In these case studies, clustering structure, which is easier to verify in data and more readily interpretable by practitioners, enables successful prediction.
Graph Kernels: State-of-the-Art and Future Challenges Douban
Author: Karsten Borgwardt / Elisabetta Ghisu Now Publishers Inc 2020
Karsten Borgwardt, Elisabetta Ghisu, Felipe Llinares-López, Leslie O’Bray and Bastian Rieck (2020), "Graph Kernels: State-of-the-Art and Future Challenges", Foundations and Trends® in Machine Learning: Vol. 13: No. 5-6, pp 531-712. http://dx.doi.org/10.1561/2200000076
https://www.nowpublishers.com/article/Details/MAL-076
https://arxiv.org/pdf/2011.03854.pdf
Among the data structures commonly used in machine learning, graphs are arguably one of the most general. Graphs allow the modelling of complex objects, each of which can be annotated by metadata. Nonetheless, seemingly simple questions, such as determining whether two graphs are identical or whether one graph is contained in another graph, are remarkably hard to solve in practice. Machine learning methods operating on graphs must therefore grapple with the need to balance computational tractability with the ability to leverage as much of the information conveyed by each graph as possible. In the last 15 years, numerous graph kernels have been proposed to solve this problem, thereby making it possible to perform predictions in both classification and regression settings.
This monograph provides a review of existing graph kernels, their applications, software plus data resources, and an empirical comparison of state-of-the-art graph kernels. It is divided into two parts: the first part focuses on the theoretical description of common graph kernels; the second part focuses on a large-scale empirical evaluation of graph kernels, as well as a description of desirable properties and requirements for benchmark data sets. Finally, the authors outline the future trends and open challenges for graph kernels.
Written for every researcher, practitioner and student of machine learning, Graph Kernels provides a comprehensive and insightful survey of the various graph kernels available today. It gives the reader a detailed typology and analysis of relevant graph kernels while exposing the relations between them and commenting on their applicability for specific data types. There is also a large-scale empirical evaluation of graph kernels.
Non-convex Optimization for Machine Learning Douban
Author: Prateek Jain / Purushottam Kar Now Publishers Inc 2017
Prateek Jain and Purushottam Kar (2017), "Non-convex Optimization for Machine Learning", Foundations and Trends® in Machine Learning: Vol. 10: No. 3-4, pp 142-363. http://dx.doi.org/10.1561/2200000058
https://www.nowpublishers.com/article/Details/MAL-058
https://www.prateekjain.org/publications/all_papers/JainK17_FTML.pdf
Non-convex Optimization for Machine Learning takes an in-depth look at the basics of non-convex optimization with applications to machine learning. It introduces the rich literature in this area, as well as equipping the reader with the tools and techniques needed to analyze these simple procedures for non-convex problems.
Non-convex Optimization for Machine Learning is as self-contained as possible without losing focus on the main topic of non-convex optimization techniques. Entire chapters are devoted to a tutorial-like treatment of basic concepts in convex analysis and optimization, as well as their non-convex counterparts. As such, this monograph can be used for a semester-length course on the basics of non-convex optimization with applications to machine learning. On the other hand, it is also possible to cherry-pick individual portions, such as the chapter on sparse recovery or the EM algorithm, for inclusion in a broader course. Several courses such as those in machine learning, optimization, and signal processing may benefit from the inclusion of such topics.
Non-convex Optimization for Machine Learning concludes with a look at four interesting applications in the areas of machine learning and signal processing and explores how the non-convex optimization techniques introduced earlier can be used to solve these problems.
Graph Representation Learning Douban
Author: William L. Hamilton Morgan & Claypool 2020 - 9
Graph-structured data is ubiquitous throughout the natural and social sciences, from telecommunication networks to quantum chemistry. Building relational inductive biases into deep learning architectures is crucial for creating systems that can learn, reason, and generalize from this kind of data. Recent years have seen a surge in research on graph representation learning, including techniques for deep graph embeddings, generalizations of convolutional neural networks to graph-structured data, and neural message-passing approaches inspired by belief propagation. These advances in graph representation learning have led to new state-of-the-art results in numerous domains, including chemical synthesis, 3D vision, recommender systems, question answering, and social network analysis.
This book provides a synthesis and overview of graph representation learning. It begins with a discussion of the goals of graph representation learning as well as key methodological foundations in graph theory and network analysis. Following this, the book introduces and reviews methods for learning node embeddings, including random-walk-based methods and applications to knowledge graphs. It then provides a technical synthesis and introduction to the highly successful graph neural network (GNN) formalism, which has become a dominant and fast-growing paradigm for deep learning with graph data. The book concludes with a synthesis of recent advancements in deep generative models for graphs--a nascent but quickly growing subset of graph representation learning.
Variational Bayesian Learning Theory Douban
Author: Shinichi Nakajima / Kazuho Watanabe Cambridge University Press 2019 - 8
Designed for researchers and graduate students in machine learning, this book introduces the theory of variational Bayesian learning, a popular machine learning method, and suggests how to make use of it in practice. Detailed derivations allow readers to follow along without prior knowledge of the specific mathematical techniques.
Statistics for High-Dimensional Data Douban
Author: Peter Bühlmann / Sara van de Geer Springer 2011 - 6
Modern statistics deals with large and complex data sets, and consequently with models containing a large number of parameters. This book presents a detailed account of recently developed approaches, including the Lasso and versions of it for various models, boosting methods, undirected graphical modeling, and procedures controlling false positive selections. A special characteristic of the book is that it contains comprehensive mathematical theory on high-dimensional statistics combined with methodology, algorithms and illustrations with real data examples. This in-depth approach highlights the methods’ great potential and practical applicability in a variety of settings. As such, it is a valuable resource for researchers, graduate students and experts in statistics, applied mathematics and computer science.
Determinantal Point Processes for Machine Learning Douban
Author: Alex Kulesza / Ben Taskar Now Publishers Inc 2012
Alex Kulesza and Ben Taskar (2012), "Determinantal Point Processes for Machine Learning", Foundations and Trends® in Machine Learning: Vol. 5: No. 2–3, pp 123-286. http://dx.doi.org/10.1561/2200000044
https://www.nowpublishers.com/article/Details/MAL-044
http://www.alexkulesza.com/pubs/dpps_fnt12.pdf
Determinantal point processes (DPPs) are elegant probabilistic models of repulsion that arise in quantum physics and random matrix theory. In contrast to traditional structured models like Markov random fields, which become intractable and hard to approximate in the presence of negative correlations, DPPs offer efficient and exact algorithms for sampling, marginalization, conditioning, and other inference tasks. While they have been studied extensively by mathematicians, giving rise to a deep and beautiful theory, DPPs are relatively new in machine learning. Determinantal Point Processes for Machine Learning provides a comprehensible introduction to DPPs, focusing on the intuitions, algorithms, and extensions that are most relevant to the machine learning community, and shows how DPPs can be applied to real-world applications like finding diverse sets of high-quality search results, building informative summaries by selecting diverse sentences from documents, modeling non-overlapping human poses in images or video, and automatically building timelines of important news stories. It presents the general mathematical background to DPPs along with a range of modeling extensions, efficient algorithms, and theoretical results that aim to enable practical modeling and learning.
High-Dimensional Probability Douban
Author: Roman Vershynin Cambridge University Press 2018 - 9
High-dimensional probability offers insight into the behavior of random vectors, random matrices, random subspaces, and objects used to quantify uncertainty in high dimensions. Drawing on ideas from probability, analysis, and geometry, it lends itself to applications in mathematics, statistics, theoretical computer science, signal processing, optimization, and more. It is the first to integrate theory, key tools, and modern applications of high-dimensional probability. Concentration inequalities form the core, and it covers both classical results such as Hoeffding's and Chernoff's inequalities and modern developments such as the matrix Bernstein's inequality. It then introduces the powerful methods based on stochastic processes, including such tools as Slepian's, Sudakov's, and Dudley's inequalities, as well as generic chaining and bounds based on VC dimension. A broad range of illustrations is embedded throughout, including classical and modern results for covariance estimation, clustering, networks, semidefinite programming, coding, dimension reduction, matrix completion, machine learning, compressed sensing, and sparse regression.
Distributed Optimization and Statistical Learning Via the Alternating Direction Method of Multipliers Douban
Author: Stephen Boyd / Neal Parikh Now Publishers Inc 2011
https://web.stanford.edu/~boyd/papers/admm_distr_stats.html
Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this review, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas–Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for ℓ1 problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.
High-Dimensional Statistics Douban Google Books
Author: Martin J. Wainwright Cambridge University Press 2019 - 1
Recent years have witnessed an explosion in the volume and variety of data collected in all scientific disciplines and industrial settings. Such massive data sets present a number of challenges to researchers in statistics and machine learning. This book provides a self-contained introduction to the area of high-dimensional statistics, aimed at the first-year graduate level. It includes chapters that are focused on core methodology and theory - including tail bounds, concentration inequalities, uniform laws and empirical process, and random matrices - as well as chapters devoted to in-depth exploration of particular model classes - including sparse linear models, matrix models with rank constraints, graphical models, and various types of non-parametric models. With hundreds of worked examples and exercises, this text is intended both for courses and for self-study by graduate students and researchers in statistics, machine learning, and related fields who must understand, apply, and adapt modern statistical methods suited to large-scale data.
Mathematics for Machine Learning Douban
Author: Marc Peter Deisenroth / A. Aldo Faisal Cambridge University Press 2020 - 1
https://mml-book.github.io/
This self-contained textbook introduces all the relevant mathematical concepts needed to understand and use machine learning methods, with a minimum of prerequisites. Topics include linear algebra, analytic geometry, matrix decompositions, vector calculus, optimization, probability and statistics.
The fundamental mathematical tools needed to understand machine learning include linear algebra, analytic geometry, matrix decompositions, vector calculus, optimization, probability and statistics. These topics are traditionally taught in disparate courses, making it hard for data science or computer science students, or professionals, to efficiently learn the mathematics. This self-contained textbook bridges the gap between mathematical and machine learning texts, introducing the mathematical concepts with a minimum of prerequisites. It uses these concepts to derive four central machine learning methods: linear regression, principal component analysis, Gaussian mixture models and support vector machines. For students and others with a mathematical background, these derivations provide a starting point to machine learning texts. For those learning the mathematics for the first time, the methods help build intuition and practical experience with applying mathematical concepts. Every chapter includes worked examples and exercises to test understanding. Programming tutorials are offered on the book's web site.
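August 21, 2019, reading: the bare minimum of mathematical background needed to understand machine learning (published in 2020; now openly available at https://mml-book.github.io/)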
Machine_Learning
Generative Deep Learning Douban Goodreads
Author: David Foster O'Reilly Media 2019 - 7
Generative modeling is one of the hottest topics in artificial intelligence. Recent advances in the field have shown how it’s possible to teach a machine to excel at human endeavors—such as drawing, composing music, and completing tasks—by generating an understanding of how its actions affect its environment.
With this practical book, machine learning engineers and data scientists will learn how to recreate some of the most famous examples of generative deep learning models, such as variational autoencoders and generative adversarial networks (GANs). You’ll also learn how to apply the techniques to your own datasets.
David Foster, cofounder of Applied Data Science, demonstrates the inner workings of each technique, starting with the basics of deep learning before advancing to the most cutting-edge algorithms in the field. Through tips and tricks, you’ll learn how to make your models learn more efficiently and become more creative.
Get a fundamental overview of deep learning
Learn about libraries such as Keras and TensorFlow
Discover how variational autoencoders work
Get practical examples of generative adversarial networks (GANs)
Understand how autoregressive generative models function
Apply generative models within a reinforcement learning setting to accomplish tasks
Matrix Methods in Data Mining and Pattern Recognition (Fundamentals of Algorithms) Douban
Author: Lars Eldén Society for Industrial and Applied Mathematics 2007 - 4
Several very powerful numerical linear algebra techniques are available for solving problems in data mining and pattern recognition. This application-oriented book describes how modern matrix methods can be used to solve these problems, gives an introduction to matrix theory and decompositions, and provides students with a set of tools that can be modified for a particular application. Part I gives a short introduction to a few application areas before presenting linear algebra concepts and matrix decompositions that students can use in problem-solving environments such as MATLAB. In Part II, linear algebra techniques are applied to data mining problems. Part III is a brief introduction to eigenvalue and singular value algorithms. The applications discussed include classification of handwritten digits, text mining, text summarization, PageRank computations related to the Google search engine, and face recognition. Exercises and computer assignments are available on a Web page that supplements the book.
Evolutionary Learning: Advances in Theories and Algorithms Douban
演化学习:理论与算法进展
Author: Zhou, Zhi-Hua / Yu, Yang Springer 2019 - 7
Many machine learning tasks involve solving complex optimization problems, such as working on non-differentiable, non-continuous, and non-unique objective functions; in some cases it can prove difficult to even define an explicit objective function. Evolutionary learning applies evolutionary algorithms to address optimization problems in machine learning, and has yielded encouraging outcomes in many applications. However, due to the heuristic nature of evolutionary optimization, most outcomes to date have been empirical and lack theoretical support. This shortcoming has kept evolutionary learning from being well received in the machine learning community, which favors solid theoretical approaches.
Recently there have been considerable efforts to address this issue. This book presents a range of those efforts, divided into four parts. Part I briefly introduces readers to evolutionary learning and provides some preliminaries, while Part II presents general theoretical tools for the analysis of running time and approximation performance in evolutionary algorithms. Based on these general tools, Part III presents a number of theoretical findings on major factors in evolutionary optimization, such as recombination, representation, inaccurate fitness evaluation, and population. In closing, Part IV addresses the development of evolutionary learning algorithms with provable theoretical guarantees for several representative tasks, in which evolutionary learning offers excellent performance.
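May 24, 2019, reading: a new book by Zhi-Hua Zhou, a leading figure in ensemble learning. At the very least, I can sense that ensemble methods and EAs (evolutionary algorithms) are fairly closely connected. The high impact factor of IEEE Transactions on Evolutionary Computation also suggests, to some extent, that for many optimization problems in machine learning there really is no alternative but to fall back on heuristics such as evolutionary and swarm-intelligence methods.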
Machine_Learning Optimization
Matrix Algebra Douban
Author: Karim M. Abadir / Jan R. Magnus Cambridge University Press 2005 - 8
Matrix Algebra is the first volume of the Econometric Exercises Series. It contains exercises relating to course material in matrix algebra that students are expected to know while enrolled in an (advanced) undergraduate or a postgraduate course in econometrics or statistics. The book contains a comprehensive collection of exercises, all with full answers. But the book is not just a collection of exercises; in fact, it is a textbook, though one that is organized in a completely different manner than the usual textbook. The volume can be used either as a self-contained course in matrix algebra or as a supplementary text.
Deep Learning through Sparse and Low-Rank Modeling Douban
Author: Zhangyang Wang / Yun Fu Academic Press 2019 - 4
https://www.elsevier.com/books/deep-learning-through-sparse-and-low-rank-modeling/wang/978-0-12-813659-1
Description:
Deep Learning through Sparse Representation and Low-Rank Modeling bridges classical sparse and low-rank models (those that emphasize problem-specific interpretability) with recent deep network models that have enabled a larger learning capacity and better utilization of big data. It shows how the toolkit of deep learning is closely tied with the sparse/low-rank methods and algorithms, providing a rich variety of theoretical and analytic tools to guide the design and interpretation of deep learning models. The development of the theory and models is supported by a wide variety of applications in computer vision, machine learning, signal processing, and data mining.
This book will be highly useful for researchers, graduate students and practitioners working in the fields of computer vision, machine learning, signal processing, optimization and statistics.
Key Features:
Combines classical sparse and low-rank models and algorithms with the latest advances in deep learning networks
Shows how the structure and algorithms of sparse and low-rank methods improve the performance and interpretability of deep learning models
Provides tactics on how to build and apply customized deep learning models for various applications
Readership:
Researchers and graduate students in computer vision, machine learning, signal processing, optimization, and statistics
April 23, 2019, reading: uploaded to http://booksdescr.org/item/index.php?md5=167383D00A7B6D3B368DCEA48960BD30
Clustering Machine_Learning
Graph Embedding for Pattern Analysis Douban
Author: Yun Fu / Yunqian Ma 2013
Graph Embedding for Pattern Analysis covers theory, methods, computation, and applications widely used in statistics, machine learning, image processing, and computer vision. This book presents the latest advances in graph embedding theories, such as nonlinear manifold graph, linearization method, graph based subspace analysis, L1 graph, hypergraph, undirected graph, and graph in vector spaces. Real-world applications of these theories span dimensionality reduction, subspace learning, manifold learning, clustering, classification, and feature selection. A select group of experts contributes the chapters of this book, which provides a comprehensive perspective on the field.