Statistics
Probabilistic Machine Learning: An Introduction 豆瓣
作者: Kevin P. Murphy The MIT Press 2022 - 3
A detailed and up-to-date introduction to machine learning, presented through the unifying lens of probabilistic modeling and Bayesian decision theory.
This book offers a detailed and up-to-date introduction to machine learning (including deep learning) through the unifying lens of probabilistic modeling and Bayesian decision theory. The book covers mathematical background (including linear algebra and optimization), basic supervised learning (including linear and logistic regression and deep neural networks), as well as more advanced topics (including transfer learning and unsupervised learning). End-of-chapter exercises allow students to apply what they have learned, and an appendix covers notation.
Probabilistic Machine Learning grew out of the author’s 2012 book, Machine Learning: A Probabilistic Perspective. More than just a simple update, this is a completely new book that reflects the dramatic developments in the field since 2012, most notably deep learning. In addition, the new book is accompanied by online Python code, using libraries such as scikit-learn, JAX, PyTorch, and Tensorflow, which can be used to reproduce nearly all the figures; this code can be run inside a web browser using cloud-based notebooks, and provides a practical complement to the theoretical topics discussed in the book. This introductory text will be followed by a sequel that covers more advanced topics, taking the same probabilistic approach.
Testing Statistical Hypotheses 豆瓣
作者: Erich L. Lehmann / Joseph P. Romano Springer 2008 - 9
The third edition of Testing Statistical Hypotheses updates and expands upon the classic graduate text, emphasizing optimality theory for hypothesis testing and confidence sets. The principal additions include a rigorous treatment of large sample optimality, together with the requisite tools. In addition, an introduction to the theory of resampling methods such as the bootstrap is developed. The sections on multiple testing and goodness of fit testing are expanded. The text is suitable for Ph.D. students in statistics and includes over 300 new problems out of a total of more than 760.
All of Nonparametric Statistics 豆瓣
作者: Larry Wasserman Springer 2010 - 11
This text provides the reader with a single book where they can find accounts of a number of up-to-date issues in nonparametric inference. The book is aimed at Masters or PhD level students in statistics, computer science, and engineering. It is also suitable for researchers who want to get up to speed quickly on modern nonparametric methods. It covers a wide range of topics including the bootstrap, the nonparametric delta method, nonparametric regression, density estimation, orthogonal function methods, minimax estimation, nonparametric confidence sets, and wavelets. The book,s dual approach includes a mixture of methodology and theory.
Probability and Computing 豆瓣
作者: Michael Mitzenmacher / Eli Upfal Cambridge University Press 2017 - 7
Greatly expanded, this new edition requires only an elementary background in discrete mathematics and offers a comprehensive introduction to the role of randomization and probabilistic techniques in modern computer science. Newly added chapters and sections cover topics including normal distributions, sample complexity, VC dimension, Rademacher complexity, power laws and related distributions, cuckoo hashing, and the Lovasz Local Lemma. Material relevant to machine learning and big data analysis enables students to learn modern techniques and applications. Among the many new exercises and examples are programming-related exercises that provide students with excellent training in solving relevant problems. This book provides an indispensable teaching tool to accompany a one- or two-semester course for advanced undergraduate students in computer science and applied mathematics.
The Probabilistic Method 豆瓣
作者: Noga Alon / Joel H. Spencer Wiley-Interscience 2008 - 8
Praise for the Second Edition : "Serious researchers in combinatorics or algorithm design will wish to read the book in its entirety...the book may also be enjoyed on a lighter level since the different chapters are largely independent and so it is possible to pick out gems in one's own area..."
— Formal Aspects of Computing This Third Edition of The Probabilistic Method reflects the most recent developments in the field while maintaining the standard of excellence that established this book as the leading reference on probabilistic methods in combinatorics. Maintaining its clear writing style, illustrative examples, and practical exercises, this new edition emphasizes methodology, enabling readers to use probabilistic techniques for solving problems in such fields as theoretical computer science, mathematics, and statistical physics. The book begins with a description of tools applied in probabilistic arguments, including basic techniques that use expectation and variance as well as the more recent applications of martingales and correlation inequalities. Next, the authors examine where probabilistic techniques have been applied successfully, exploring such topics as discrepancy and random graphs, circuit complexity, computational geometry, and derandomization of randomized algorithms. Sections labeled "The Probabilistic Lens" offer additional insights into the application of the probabilistic approach, and the appendix has been updated to include methodologies for finding lower bounds for Large Deviations. The Third Edition also features: A new chapter on graph property testing, which is a current topic that incorporates combinatorial, probabilistic, and algorithmic techniques An elementary approach using probabilistic techniques to the powerful Szemerédi Regularity Lemma and its applications New sections devoted to percolation and liar games A new chapter that provides a modern treatment of the Erdös-Rényi phase transition in the Random Graph Process Written by two leading authorities in the field, The Probabilistic Method , Third Edition is an ideal reference for researchers in combinatorics and algorithm design who would like to better understand the use of probabilistic methods. The book's numerous exercises and examples also make it an excellent textbook for graduate-level courses in mathematics and computer science.
Applied Multivariate Statistical Analysis 豆瓣
作者: Wolfgang Karl Härdle / Léopold Simar Springer 2015 - 2
Focusing on high-dimensional applications, this 4th edition presents the tools and concepts used in multivariate data analysis in a style that is also accessible for non-mathematicians and practitioners. It surveys the basic principles and emphasizes both exploratory and inferential statistics; a new chapter on Variable Selection (Lasso, SCAD and Elastic Net) has also been added. All chapters include practical exercises that highlight applications in different multivariate data analysis fields: in quantitative financial studies, where the joint dynamics of assets are observed; in medicine, where recorded observations of subjects in different locations form the basis for reliable diagnoses and medication; and in quantitative marketing, where consumers’ preferences are collected in order to construct models of consumer behavior. All of these examples involve high to ultra-high dimensions and represent a number of major fields in big data analysis.
The fourth edition of this book on Applied Multivariate Statistical Analysis offers the following new features:
A new chapter on Variable Selection (Lasso, SCAD and Elastic Net)
All exercises are supplemented by R and MATLAB code that can be found on www.quantlet.de.
The practical exercises include solutions that can be found in Härdle, W. and Hlavka, Z., Multivariate Statistics: Exercises and Solutions. Springer Verlag, Heidelberg.
Theoretical Statistics 豆瓣
作者: Robert W. Keener Springer 2010 - 9
Intended as the text for a sequence of advanced courses, this book covers major topics in theoretical statistics in a concise and rigorous fashion. The discussion assumes a background in advanced calculus, linear algebra, probability, and some analysis and topology. Measure theory is used, but the notation and basic results needed are presented in an initial chapter on probability, so prior knowledge of these topics is not essential. The presentation is designed to expose students to as many of the central ideas and topics in the discipline as possible, balancing various approaches to inference as well as exact, numerical, and large sample methods. Moving beyond more standard material, the book includes chapters introducing bootstrap methods, nonparametric regression, equivariant estimation, empirical Bayes, and sequential design and analysis. The book has a rich collection of exercises. Several of them illustrate how the theory developed in the book may be used in various applications. Solutions to many of the exercises are included in an appendix.
Large-Scale Inference 豆瓣
作者: Bradley Efron Cambridge University Press 2012 - 11
We live in a new age for statistical inference, where modern scientific technology such as microarrays and fMRI machines routinely produce thousands and sometimes millions of parallel data sets, each with its own estimation or testing problem. Doing thousands of problems at once is more than repeated application of classical methods. Taking an empirical Bayes approach, Bradley Efron, inventor of the bootstrap, shows how information accrues across problems in a way that combines Bayesian and frequentist ideas. Estimation, testing and prediction blend in this framework, producing opportunities for new methodologies of increased power. New difficulties also arise, easily leading to flawed inferences. This book takes a careful look at both the promise and pitfalls of large-scale statistical inference, with particular attention to false discovery rates, the most successful of the new statistical techniques. Emphasis is on the inferential ideas underlying technical developments, illustrated using a large number of real examples.
The Nature of Statistical Learning Theory 豆瓣
作者: Vladimir Vapnik Springer 1999 - 11
The aim of this book is to discuss the fundamental ideas which lie behind the statistical theory of learning and generalization. It considers learning as a general problem of function estimation based on empirical data. Omitting proofs and technical details, the author concentrates on discussing the main results of learning theory and their connections to fundamental problems in statistics. This second edition contains three new chapters devoted to further development of the learning theory and SVM techniques. Written in a readable and concise style, the book is intended for statisticians, mathematicians, physicists, and computer scientists.
Statistics for High-Dimensional Data 豆瓣
作者: Peter Bühlmann / Sara van de Geer Springer 2011 - 6
Modern statistics deals with large and complex data sets, and consequently with models containing a large number of parameters. This book presents a detailed account of recently developed approaches, including the Lasso and versions of it for various models, boosting methods, undirected graphical modeling, and procedures controlling false positive selections. A special characteristic of the book is that it contains comprehensive mathematical theory on high-dimensional statistics combined with methodology, algorithms and illustrations with real data examples. This in-depth approach highlights the methods’ great potential and practical applicability in a variety of settings. As such, it is a valuable resource for researchers, graduate students and experts in statistics, applied mathematics and computer science.
High-Dimensional Probability 豆瓣
作者: Roman Vershynin Cambridge University Press 2018 - 9
High-dimensional probability offers insight into the behavior of random vectors, random matrices, random subspaces, and objects used to quantify uncertainty in high dimensions. Drawing on ideas from probability, analysis, and geometry, it lends itself to applications in mathematics, statistics, theoretical computer science, signal processing, optimization, and more. It is the first to integrate theory, key tools, and modern applications of high-dimensional probability. Concentration inequalities form the core, and it covers both classical results such as Hoeffding's and Chernoff's inequalities and modern developments such as the matrix Bernstein's inequality. It then introduces the powerful methods based on stochastic processes, including such tools as Slepian's, Sudakov's, and Dudley's inequalities, as well as generic chaining and bounds based on VC dimension. A broad range of illustrations is embedded throughout, including classical and modern results for covariance estimation, clustering, networks, semidefinite programming, coding, dimension reduction, matrix completion, machine learning, compressed sensing, and sparse regression.
Distributed Optimization and Statistical Learning Via the Alternating Direction Method of Multipliers 豆瓣
作者: Stephen Boyd / Neal Parikh Now Publishers Inc 2011
https://web.stanford.edu/~boyd/papers/admm_distr_stats.html
Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this review, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas–Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for ℓ1 problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop Map Reduce implementations.
High-Dimensional Statistics 豆瓣 谷歌图书
作者: Martin J. Wainwright Cambridge University Press 2019 - 1
Recent years have witnessed an explosion in the volume and variety of data collected in all scientific disciplines and industrial settings. Such massive data sets present a number of challenges to researchers in statistics and machine learning. This book provides a self-contained introduction to the area of high-dimensional statistics, aimed at the first-year graduate level. It includes chapters that are focused on core methodology and theory - including tail bounds, concentration inequalities, uniform laws and empirical process, and random matrices - as well as chapters devoted to in-depth exploration of particular model classes - including sparse linear models, matrix models with rank constraints, graphical models, and various types of non-parametric models. With hundreds of worked examples and exercises, this text is intended both for courses and for self-study by graduate students and researchers in statistics, machine learning, and related fields who must understand, apply, and adapt modern statistical methods suited to large-scale data.
Statistical Rethinking 豆瓣
作者: Richard McElreath Chapman and Hall/CRC 2015
Statistical Rethinking: A Bayesian Course with Examples in R and Stan builds readers’ knowledge of and confidence in statistical modeling. Reflecting the need for even minor programming in today’s model-based statistics, the book pushes readers to perform step-by-step calculations that are usually automated. This unique computational approach ensures that readers understand enough of the details to make reasonable choices and interpretations in their own modeling work.
The text presents generalized linear multilevel models from a Bayesian perspective, relying on a simple logical interpretation of Bayesian probability and maximum entropy. It covers from the basics of regression to multilevel models. The author also discusses measurement error, missing data, and Gaussian process models for spatial and network autocorrelation.
By using complete R code examples throughout, this book provides a practical foundation for performing statistical inference. Designed for both PhD students and seasoned professionals in the natural and social sciences, it prepares them for more advanced or specialized statistical modeling.
Introduction to Nonparametric Estimation 豆瓣
作者: Alexandre B. Tsybakov Springer 2008
This is a concise text developed from lecture notes and ready to be used for a course on the graduate level. The main idea is to introduce the fundamental concepts of the theory while maintaining the exposition suitable for a first approach in the field. Therefore, the results are not always given in the most general form but rather under assumptions that lead to shorter or more elegant proofs. The book has three chapters. Chapter 1 presents basic nonparametric regression and density estimators and analyzes their properties. Chapter 2 is devoted to a detailed treatment of minimax lower bounds. Chapter 3 develops more advanced topics: Pinsker's theorem, oracle inequalities, Stein shrinkage, and sharp minimax adaptivity. This book will be useful for researchers and grad students interested in theoretical aspects of smoothing techniques. Many important and useful results on optimal and adaptive estimation are provided. As one of the leading mathematical statisticians working in nonparametrics, the author is an authority on the subject.