Machine Learning
Parallel Distributed Processing, Vol. 1
Authors: David E. Rumelhart / James L. McClelland Publisher: A Bradford Book 1987 - 7
What makes people smarter than computers? These volumes by a pioneering neurocomputing group suggest that the answer lies in the massively parallel architecture of the human mind. They describe a new theory of cognition called connectionism that is challenging the idea of symbolic computation that has traditionally been at the center of debate in theoretical discussions about the mind. The authors' theory assumes the mind is composed of a great number of elementary units connected in a neural network. Mental processes are interactions between these units which excite and inhibit each other in parallel rather than sequential operations. In this context, knowledge can no longer be thought of as stored in localized structures; instead, it consists of the connections between pairs of units that are distributed throughout the network. Volume 1 lays the foundations of this exciting theory of parallel distributed processing, while Volume 2 applies it to a number of specific issues in cognitive science and neuroscience, with chapters describing models of aspects of perception, memory, language, and thought.
Data Analysis
Authors: Devinderjit Sivia / John Skilling Publisher: Oxford University Press 2006 - 7
Statistics lectures have been a source of much bewilderment and frustration for generations of students. This book attempts to remedy the situation by expounding a logical and unified approach to the whole subject of data analysis.
This text is intended as a tutorial guide for senior undergraduates and research students in science and engineering. After explaining the basic principles of Bayesian probability theory, their use is illustrated with a variety of examples ranging from elementary parameter estimation to image processing. Other topics covered include reliability analysis, multivariate optimization, least-squares and maximum likelihood, error-propagation, hypothesis testing, maximum entropy and experimental design.
The Second Edition of this successful tutorial book contains a new chapter on extensions to the ubiquitous least-squares procedure, allowing for the straightforward handling of outliers and unknown correlated noise, and a cutting-edge contribution from John Skilling on a novel numerical technique for Bayesian computation called 'nested sampling'.
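Probabilistic Graphical Models: Principles and Techniques (概率图模型:原理与技术)
Authors: [US] Daphne Koller / [Israel] Nir Friedman Translators: Wang Feiyue / Han Suqing Publisher: Tsinghua University Press 2015 - 3
Probabilistic graphical models combine probability theory with graph theory and are currently a very active research direction in machine learning. This book discusses in detail the representation, inference, and learning problems for directed graphical models (also called Bayesian networks) and undirected graphical models (also called Markov networks), and comprehensively summarizes the latest advances in this frontier area of artificial intelligence. To aid understanding, the book includes a large number of definitions, theorems, proofs, and algorithms with pseudocode, interspersed with supporting material such as examples, skill boxes, case study boxes, and concept boxes. In addition, Chapter 2 introduces the core concepts of probability theory and graph theory, and the appendices cover supplementary material on information theory, algorithmic complexity, and combinatorial optimization, providing a complete foundation for learning and applying probabilistic graphical models.
This book can serve as a textbook and reference for students, teachers, and researchers at universities and research institutions working in artificial intelligence, machine learning, pattern recognition, signal processing, and related areas.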
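== Preface ==
We are delighted to see our book, Probabilistic Graphical Models, translated and published in Chinese. We understand that the topics covered in this book have attracted enormous interest in China. Many Chinese readers have written to us explaining how important this book has been to their studies and expressing the hope for a more accessible edition. With the publication of many papers authored or co-authored by Chinese scholars, whether at Chinese institutions or at institutions abroad, Chinese researchers have come to play a very important role in the field of probabilistic graphical models, and these papers have contributed greatly to the field's development. We believe that the Chinese edition of Probabilistic Graphical Models will help many Chinese readers learn and master the foundations of this important subject. It will also further strengthen the ability of Chinese scholars to apply the ideas of probabilistic graphical models and contribute to the development of the field.
The translation was led by Professor Wang Feiyue, with the support of Professor Wang Jue and his many assistants and collaborators. It was a landmark effort that took five years, and I am deeply grateful to everyone on the team who contributed to the translation. I would especially like to take this opportunity to thank Professor Wang Jue, a pioneer of machine learning in China. He was a crucial driving force behind this translation. Without his support, and without the help of his many outstanding students in machine learning, this work might still be unfinished. Sadly, Professor Wang Jue died of cancer in December 2014 at the age of 66 and could not see the result of his efforts. His ideas nevertheless live on in the work of his students and in the publication of this book.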
Daphne Koller
(Translated by Wang Xiao, State Key Laboratory of Management and Control for Complex Systems)
Learning with Kernels
Authors: Bernhard Schölkopf / Alexander J. Smola Publisher: The MIT Press 2001
In the 1990s, a new type of learning algorithm was developed, based on results from statistical learning theory: the Support Vector Machine (SVM). This gave rise to a new class of theoretically elegant learning machines that use a central concept of SVMs (kernels) for a number of learning tasks. Kernel machines provide a modular framework that can be adapted to different tasks and domains by the choice of the kernel function and the base algorithm. They are replacing neural networks in a variety of fields, including engineering, information retrieval, and bioinformatics.
Learning with Kernels provides an introduction to SVMs and related kernel methods. Although the book begins with the basics, it also includes the latest research. It provides all of the concepts necessary to enable a reader equipped with some basic mathematical knowledge to enter the world of machine learning using theoretically well-founded yet easy-to-use kernel algorithms and to understand and apply the powerful algorithms that have been developed over the last few years.
Neural Network Programming with Python (Python神经网络编程)
Make Your Own Neural Network
8.5 (12 ratings) Author: [UK] Tariq Rashid Translator: Lin Ci Publisher: Posts & Telecom Press 2018 - 4
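A neural network is a machine learning technique that models the neural networks of the human brain, with the aim of achieving human-like artificial intelligence.
This book explains the concepts behind neural networks and shows how to implement them in Python. It consists of three chapters and two appendices. Chapter 1 introduces the mathematical ideas used in neural networks. Chapter 2 implements a neural network in Python, uses it to recognize handwritten digits, and tests its performance. Chapter 3 takes readers further into simple neural networks: looking inside a trained network, trying to improve its performance, and deepening their understanding of the material. The appendices cover the required calculus and the Raspberry Pi, respectively.
This book is suitable as a study reference for readers who want to research and explore neural networks, and for readers interested in artificial intelligence, machine learning, deep learning, and related fields.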
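Speech and Language Processing, Second Edition (自然语言处理综论(第二版))
Authors: Feng Zhiwei / Daniel Jurafsky Translators: Feng Zhiwei / Sun Le Publisher: Publishing House of Electronics Industry 2018 - 3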
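Since its first edition was published, this book has been widely praised. It has been adopted by many universities abroad as a textbook for natural language processing or computational linguistics and is regarded as the "gold standard" among textbooks in the field.
The first edition brought together natural language processing, computational linguistics, and speech recognition. It gave a comprehensive account of computer processing of natural language, explored in depth the lexical, syntactic, semantic, and pragmatic issues involved, and introduced a range of modern natural language processing techniques. This edition is a thorough rewrite of the first, adding a great deal of material that reflects the latest achievements in natural language processing, especially on speech processing and statistical techniques, which gives the book an entirely new look. The book has four distinguishing features: comprehensive coverage, an emphasis on practicality, attention to evaluation, and a corpus-based approach. It gives a comprehensive account of natural language processing technology.
Building on the first edition, this edition adds the latest achievements in natural language processing, especially on speech processing and statistical techniques, giving the book an entirely new look. The book has five parts. Part I, "Computer Processing of Words," covers the computational treatment of words, including word segmentation, morphology, minimum edit distance, and parts of speech, together with algorithms and models for word processing such as regular expressions, finite-state automata, finite-state transducers, N-gram models, hidden Markov models, and maximum entropy models. Part II, "Computer Processing of Speech," introduces phonetics, speech synthesis, automatic speech recognition, and computational phonology. Part III, "Computer Processing of Syntax," introduces formal grammars of English and the main parsing algorithms, including CKY parsing, Earley parsing, and statistical parsing, and presents analytical tools such as unification and typed feature structures, the Chomsky hierarchy, and the pumping lemma. Part IV, "Computer Processing of Semantics and Pragmatics," introduces representations of meaning, computational semantics, lexical semantics, and computational lexical semantics, as well as discourse analysis problems such as coreference and coherence. Part V, "Applications," covers natural language processing applications including information extraction, question answering, automatic summarization, dialogue and conversational agents, and machine translation. The writing is accessible, rich in examples, and engaging. The book can serve as a textbook for undergraduate and graduate courses in natural language processing or computational linguistics, and as an essential reference for researchers and engineers working in artificial intelligence, natural language processing, and related fields.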
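Introduction to Deep Learning (深度学习入门)
Deep Learning from Scratch
9.4 (21 ratings) Author: [Japan] Koki Saitoh (斋藤康毅) Translator: Lu Yujie Publisher: Posts & Telecom Press 2018 - 7
This is a true introduction to deep learning that explains its principles and related techniques in an accessible yet thorough way. Using Python 3 and relying as little as possible on external libraries or tools, the book starts from basic mathematics and guides readers through building a classic deep learning network from scratch, so that they come to understand deep learning step by step along the way. Beyond the basics, such as the concepts and characteristics of deep learning and neural networks, it gives in-depth treatment of error backpropagation and convolutional neural networks. It also covers practical techniques for deep learning, applications such as autonomous driving, image generation, and reinforcement learning, and "why" questions such as why adding layers improves recognition accuracy.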
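Mastering Data Science: From Linear Regression to Deep Learning (精通数据科学:从线性回归到深度学习)
Author: Tang Gen Publisher: Posts & Telecom Press 2018 - 5
Data science is a broad discipline that draws on knowledge and skills from three areas: statistical analysis, machine learning, and computer science. This book introduces the discipline comprehensively and systematically, in an accessible way.
The book has 13 chapters. The first three introduce the problems data science sets out to solve, the commonly used tool Python, and the mathematical foundations of the discipline. Chapters 4 to 7 discuss data models and cover three topics: first, linear regression and logistic regression, the most classic models in statistics; second, stochastic gradient descent, the method computers use to estimate model parameters and the basis for implementing models in practice; and third, lessons from econometrics, mainly methods of feature extraction and the stability of models. Chapters 8 to 10 discuss algorithmic models, that is, the classic models of machine learning, covering supervised learning, generative models, and unsupervised learning in turn. The two most active frontiers of data science today are big data and artificial intelligence. Chapter 11 introduces distributed machine learning, an important part of big data, and the final two chapters discuss neural networks and deep learning in artificial intelligence.
The book is easy to follow and combines theory with practice. It can serve as a study guide for data scientists and data engineers and is also suitable for beginners with a strong interest in data science. It can likewise be used as a textbook for teachers and students in computer science, mathematics, and related programs at universities, and in training schools.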
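Computational Advertising (计算广告)
8.0 (7 ratings) Authors: Liu Peng / Wang Chao Publisher: Posts & Telecom Press 2015 - 9
Computational advertising is an emerging research topic that draws on many fields, including large-scale search and text analysis, information retrieval, statistical modeling, machine learning, classification, optimization, and microeconomics. Starting from practice, this book systematically introduces the products, problems, systems, and algorithms of computational advertising and analyzes the field's specific techniques in depth from an industry perspective.
The book is grounded in the fundamental problems of the advertising market. Starting from the market challenges encountered at each stage of computational advertising, and following the changing business requirements of advertising systems as its main thread, it introduces in turn contract-based advertising systems, auction-based advertising systems, programmatic trading markets, and other important topics, and discusses in depth the key technologies and algorithms involved.
Product and engineering staff in the monetization departments of internet companies, product and engineering staff interested in personalization systems or in monetizing and trading big data, decision-makers leading the digital transformation of traditional enterprises, practitioners in traditional advertising, internet entrepreneurs, and graduate students in computer-related programs will all benefit from reading this book.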
Deep Learning with Python
Author: François Chollet Publisher: Manning Publications 2017 - 10
Deep Learning with Python introduces the field of deep learning using the Python language and the powerful Keras library. Written by Keras creator and Google AI researcher François Chollet, this book builds your understanding through intuitive explanations and practical examples. You'll explore challenging concepts and practice with applications in computer vision, natural-language processing, and generative models. By the time you finish, you'll have the knowledge and hands-on skills to apply deep learning in your own projects.
Learning From Data
10.0 (7 ratings) Authors: Yaser S. Abu-Mostafa / Malik Magdon-Ismail Publisher: AMLBook 2012 - 3
Machine learning allows computational systems to adaptively improve their performance with experience accumulated from the observed data. Its techniques are widely applied in engineering, science, finance, and commerce. This book is designed for a short course on machine learning. It is a short course, not a hurried course. From over a decade of teaching this material, we have distilled what we believe to be the core topics that every student of the subject should know. We chose the title 'learning from data' that faithfully describes what the subject is about, and made it a point to cover the topics in a story-like fashion. Our hope is that the reader can learn all the fundamentals of the subject by reading the book cover to cover.
Learning from data has distinct theoretical and practical tracks. In this book, we balance the theoretical and the practical, the mathematical and the heuristic. Our criterion for inclusion is relevance. Theory that establishes the conceptual framework for learning is included, and so are heuristics that impact the performance of real learning systems.
Learning from data is a very dynamic field. Some of the hot techniques and theories at times become just fads, and others gain traction and become part of the field. What we have emphasized in this book are the necessary fundamentals that give any student of learning from data a solid foundation, and enable him or her to venture out and explore further techniques and theories, or perhaps to contribute their own.
The authors are professors at California Institute of Technology (Caltech), Rensselaer Polytechnic Institute (RPI), and National Taiwan University (NTU), where this book is the main text for their popular courses on machine learning. The authors also consult extensively with financial and commercial companies on machine learning applications, and have led winning teams in machine learning competitions.
Information Theory, Inference and Learning Algorithms
Information Theory, Inference & Learning Algorithms
10.0 (5 ratings) Author: David J. C. MacKay Publisher: Cambridge University Press 2003 - 10
Information theory and inference, taught together in this exciting textbook, lie at the heart of many important areas of modern technology - communication, signal processing, data mining, machine learning, pattern recognition, computational neuroscience, bioinformatics and cryptography. The book introduces theory in tandem with applications. Information theory is taught alongside practical communication systems such as arithmetic coding for data compression and sparse-graph codes for error-correction. Inference techniques, including message-passing algorithms, Monte Carlo methods and variational approximations, are developed alongside applications to clustering, convolutional codes, independent component analysis, and neural networks. Uniquely, the book covers state-of-the-art error-correcting codes, including low-density-parity-check codes, turbo codes, and digital fountain codes - the twenty-first-century standards for satellite communications, disk drives, and data broadcast. Richly illustrated, filled with worked examples and over 400 exercises, some with detailed solutions, the book is ideal for self-learning, and for undergraduate or graduate courses. It also provides an unparalleled entry point for professionals in areas as diverse as computational biology, financial engineering and machine learning.
Applied Predictive Modeling
Authors: Max Kuhn / Kjell Johnson Publisher: Springer 2013 - 9
This text is intended for a broad audience as both an introduction to predictive models as well as a guide to applying them. Non-mathematical readers will appreciate the intuitive explanations of the techniques while an emphasis on problem-solving with real data across a wide variety of applications will aid practitioners who wish to extend their expertise. Readers should have knowledge of basic statistical ideas, such as correlation and linear regression analysis. While the text is biased against complex equations, a mathematical background is needed for advanced topics. Dr. Kuhn is a Director of Non-Clinical Statistics at Pfizer Global R&D in Groton Connecticut. He has been applying predictive models in the pharmaceutical and diagnostic industries for over 15 years and is the author of a number of R packages. Dr. Johnson has more than a decade of statistical consulting and predictive modeling experience in pharmaceutical research and development. He is a co-founder of Arbor Analytics, a firm specializing in predictive modeling and is a former Director of Statistics at Pfizer Global R&D. His scholarly work centers on the application and development of statistical methodology and learning algorithms.
Statistical Rethinking
Author: Richard McElreath Publisher: Chapman and Hall/CRC 2015
Statistical Rethinking: A Bayesian Course with Examples in R and Stan builds readers’ knowledge of and confidence in statistical modeling. Reflecting the need for even minor programming in today’s model-based statistics, the book pushes readers to perform step-by-step calculations that are usually automated. This unique computational approach ensures that readers understand enough of the details to make reasonable choices and interpretations in their own modeling work.
The text presents generalized linear multilevel models from a Bayesian perspective, relying on a simple logical interpretation of Bayesian probability and maximum entropy. It covers from the basics of regression to multilevel models. The author also discusses measurement error, missing data, and Gaussian process models for spatial and network autocorrelation.
By using complete R code examples throughout, this book provides a practical foundation for performing statistical inference. Designed for both PhD students and seasoned professionals in the natural and social sciences, it prepares them for more advanced or specialized statistical modeling.
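Programming Collective Intelligence (集体智慧编程)
Programming Collective Intelligence
8.0 (17 ratings) Author: Toby Segaran Translators: Mo Ying / Wang Kaifu Publisher: Publishing House of Electronics Industry 2009 - 1
Set against the background of machine learning and computational statistics, this book explains how to mine and analyze data and resources on the Web, how to analyze information such as user experience, marketing, and personal taste and draw useful conclusions from it, and how to use sophisticated algorithms to obtain, collect, and analyze user data and feedback from websites in order to create new value for users and businesses. The book is rich in content, covering collaborative filtering (for recommending related products), cluster analysis (for discovering groups of similar items in large datasets), core search engine technology (crawlers, indexing, query engines, the PageRank algorithm, and so on), optimization algorithms that search massive amounts of information and draw conclusions through statistical analysis, Bayesian filtering (spam filtering and text filtering), decision trees for prediction and decision modeling, information matching in social networks, and applications of machine learning and artificial intelligence.
This book is an excellent choice for Web developers, architects, application engineers, and others.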
The Master Algorithm
Author: Pedro Domingos Publisher: Basic Books 2015 - 9
A thought-provoking and wide-ranging exploration of machine learning and the race to build computer intelligences as flexible as our own.
In the world's top research labs and universities, the race is on to invent the ultimate learning algorithm: one capable of discovering any knowledge from data, and doing anything we want, before we even ask. In The Master Algorithm, Pedro Domingos lifts the veil to give us a peek inside the learning machines that power Google, Amazon, and your smartphone. He assembles a blueprint for the future universal learner--the Master Algorithm--and discusses what it will mean for business, science, and society. If data-ism is today's philosophy, this book is its bible.
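New Methods in Data Mining: Support Vector Machines (数据挖掘中的新方法:支持向量机)
Authors: Deng Naiyang / Tian Yingjie Publisher: Science Press 2004 - 6
The support vector machine is a new method in data mining. It handles regression problems (time series analysis), pattern recognition (classification and discriminant analysis), and many other problems very successfully, and it extends to areas such as forecasting and comprehensive evaluation, so it can be applied across science, engineering, management, and other disciplines. Internationally, support vector machines are developing rapidly in both theoretical research and practical application. We hope this book will promote their adoption and advancement in China.
The book is intended both for researchers concerned with theory and for practitioners concerned with applications. Practitioners in related fields who have a background in advanced mathematics can skip some of the theoretical parts of the book and still gain a general understanding of the essence of support vector machines, and so use them to solve their own problems.
This book is suitable for senior undergraduates, graduate students, teachers, and researchers at universities, and for practitioners in related fields.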