Statistics
The Lady Tasting Tea Douban
Author: David Salsburg Non Basic Stock Line 2002 - 5
At a summer tea party in Cambridge, England, a lady states that tea poured into milk tastes different from milk poured into tea. Her notion is shouted down by the scientific minds of the group. But one guest, by the name of Ronald Aylmer Fisher, proposes to scientifically test the lady's hypothesis. There was no better person to conduct such a test, for Fisher had brought to the field of statistics an emphasis on controlling the methods for obtaining data and the importance of interpretation. He knew that how the data were gathered and applied was as important as the data themselves.
In The Lady Tasting Tea, readers will encounter not only Ronald Fisher's theories (and their repercussions), but the ideas of dozens of men and women whose revolutionary work affects our everyday lives. Writing with verve and wit, author David Salsburg traces the rise and fall of Karl Pearson's theories, explores W. Edwards Deming's statistical methods of quality control (which rebuilt postwar Japan's economy), and relates the story of Stella Cunliffe's early work on the capacity of small beer casks at the Guinness brewing factory.
The Lady Tasting Tea is not a book of dry facts and figures, but the history of great individuals who dared to look at the world in a new way.
Trustworthy Online Controlled Experiments Douban
Author: Ron Kohavi / Diane Tang Cambridge University Press 2020 - 5
Getting numbers is easy; getting numbers you can trust is hard. This practical guide by experimentation leaders at Google, LinkedIn, and Microsoft will teach you how to accelerate innovation using trustworthy online controlled experiments, or A/B tests. Based on practical experiences at companies that each run more than 20,000 controlled experiments a year, the authors share examples, pitfalls, and advice for students and industry professionals getting started with experiments, plus deeper dives into advanced topics for practitioners who want to improve the way they make data-driven decisions. Learn how to:
• Use the scientific method to evaluate hypotheses using controlled experiments
• Define key metrics and ideally an Overall Evaluation Criterion
• Test for trustworthiness of the results and alert experimenters to violated assumptions
• Build a scalable platform that lowers the marginal cost of experiments close to zero
• Avoid pitfalls like carryover effects and Twyman's law
• Understand how statistical issues play out in practice
Probability Theory: The Logic of Science Douban
Author: E. T. Jaynes Cambridge University Press 2003
The standard rules of probability can be interpreted as uniquely valid principles in logic. In this book, E. T. Jaynes dispels the imaginary distinction between 'probability theory' and 'statistical inference', leaving a logical unity and simplicity, which provides greater technical power and flexibility in applications. This book goes beyond the conventional mathematics of probability theory, viewing the subject in a wider context. New results are discussed, along with applications of probability theory to a wide variety of problems in physics, mathematics, economics, chemistry and biology. It contains many exercises and problems, and is suitable for use as a textbook on graduate level courses involving data analysis. The material is aimed at readers who are already familiar with applied mathematics at an advanced undergraduate level or higher. The book will be of interest to scientists working in any area where inference from incomplete information is necessary.
Foundations of Statistical Natural Language Processing Douban
Author: Christopher D. Manning / Hinrich Schütze The MIT Press 1999 - 6
Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.
Statistical Rethinking Douban
Author: Richard McElreath Chapman and Hall/CRC 2015
Statistical Rethinking: A Bayesian Course with Examples in R and Stan builds readers’ knowledge of and confidence in statistical modeling. Reflecting the need for even minor programming in today’s model-based statistics, the book pushes readers to perform step-by-step calculations that are usually automated. This unique computational approach ensures that readers understand enough of the details to make reasonable choices and interpretations in their own modeling work.
The text presents generalized linear multilevel models from a Bayesian perspective, relying on a simple logical interpretation of Bayesian probability and maximum entropy. It covers everything from the basics of regression to multilevel models. The author also discusses measurement error, missing data, and Gaussian process models for spatial and network autocorrelation.
By using complete R code examples throughout, this book provides a practical foundation for performing statistical inference. Designed for both PhD students and seasoned professionals in the natural and social sciences, it prepares them for more advanced or specialized statistical modeling.
The Art of Statistics Douban
Author: David Spiegelhalter Pelican 2019 - 3
Statistics has played a leading role in our scientific understanding of the world for centuries, yet we are all familiar with the way statistical claims can be sensationalised, particularly in the media. In the age of big data, as data science becomes established as a discipline, a basic grasp of statistical literacy is more important than ever.
In The Art of Statistics, David Spiegelhalter guides the reader through the essential principles we need in order to derive knowledge from data. Drawing on real world problems to introduce conceptual issues, he shows us how statistics can help us determine the luckiest passenger on the Titanic, whether serial killer Harold Shipman could have been caught earlier, and if screening for ovarian cancer is beneficial.
How many trees are there on the planet? Do busier hospitals have higher survival rates? Why do old men have big ears? Spiegelhalter reveals the answers to these and many other questions - questions that can only be addressed using statistical science.
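我国20个统计指标的历史变迁 (The Historical Evolution of 20 Statistical Indicators in China) Douban
Author: National Bureau of Statistics Compilation Group (国家统计局编写组) 2017 - 8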
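Changes in statistical indicators reflect, from one angle, changes in statistical work, and in turn changes in the economy and society as a whole. This book selects 20 core indicators produced and published by the National Bureau of Statistics, covering 14 subject areas. The account of each indicator consists broadly of four parts: a brief history of its changes, first-hand accounts of those changes, charts of the changes, and data for each year. The book offers a distinctive perspective for understanding and studying the history of statistics in the People's Republic of China, and practical help for understanding and using statistical indicators correctly.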
基本有用的计量经济学 (Basically Useful Econometrics) Douban
Author: Zhao Xiliang (赵西亮) Peking University Press 2017 - 7
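Starting from the basic ideas of causal inference, 基本有用的计量经济学 gives a detailed introduction to modern empirical methods, including the Rubin potential outcomes framework, randomized experiments, matching, regression, instrumental variables, difference-in-differences, and regression discontinuity, providing important causal inference tools for students and scholars in the social sciences, statistics, medical statistics, and related fields. Among applied econometric models, the book focuses on the principles and methods for choosing the model type, selecting model variables, specifying the functional form, and specifying the properties of model variables. Building on a detailed presentation of the mathematics of the linear regression model, the chapters emphasize not the mathematical derivation and proof of theoretical methods but the handling of practical problems that arise in applications, illustrated wherever possible with modeling examples from China.
The book is suitable as a textbook or teaching reference for undergraduate and master's students in economics and management at colleges and universities, and can also be read and consulted by economic managers and researchers with some background in mathematics, economics, and economic statistics.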
A Probability Path Douban
Author: Sidney Resnick Birkhäuser 1999 - 10
Many probability books are written by mathematicians and have the built-in bias that the reader is assumed to be a mathematician coming to the material for its beauty. This textbook is geared towards beginning graduate students from a variety of disciplines whose primary focus is not necessarily mathematics for its own sake. Instead, A Probability Path is designed for those requiring a deep understanding of advanced probability for their research in statistics, applied probability, biology, operations research, mathematical finance, and engineering.
R语言经典实例 Douban Goodreads
R Cookbook
Author: Paul Teetor Translators: Li Hongcheng (李洪成) / Zhu Wenjia (朱文佳) China Machine Press (机械工业出版社) 2013 - 5
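[Editor's Recommendation]
“More than a book of solutions, this is a truly enjoyable way to learn R, one practical example at a time, and it is very easy to read!”
- Jeffrey Ryan, software consultant and R package author
“With 95% confidence, I cannot reject the conclusion that this book is the best text for learning and applying the statistical functions in R.”
- JD Long, R blogger at CerebralMastication.com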
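[About the Book]
This book covers more than 200 practical R recipes that help readers use R for data analysis quickly and effectively. R provides every tool needed for statistical analysis, but the structure of R itself can be somewhat difficult to master. The concise, task-oriented recipes in this book range from basic analysis tasks to input and output, common statistical analyses, graphics, and linear regression, and they let you put R to productive use right away.
Each recipe addresses a specific problem, followed by a discussion that explains the solution and how it works. For beginners, this book will help you get started with R; for experienced users, it will deepen your understanding of R and broaden your horizons. With this book, you can get your analysis done and learn more about R along the way.
Main topics: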
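■ Create vectors, handle variables, and perform other basic functions.
■ Input and output data.
■ Work with data structures such as matrices, lists, factors, and data frames.
■ Work with probability, probability distributions, and random variables.
■ Calculate statistics and confidence intervals, and perform statistical tests.
■ Create a variety of graphs.
■ Build statistical models such as linear regression and analysis of variance (ANOVA).
■ Explore advanced statistical techniques, such as cluster analysis.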
数据挖掘技术 Douban
Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, Third Edition
Author: Gordon S. Linoff / Michael J. A. Berry Translators: Chao Wenhan (巢文涵) / Zhang Xiaoming (张小明) 2013 - 3
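About Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management (Third Edition): Who will be a loyal customer, and who will not? Which messages are most effective for which customer segments? How can customer value be maximized? The book provides powerful tools for extracting answers to these and other important business questions from the corporate databases where they reside. Since the first edition appeared, data mining has increasingly become an indispensable tool of modern business. In this latest edition, the authors have extensively updated and revised every chapter and added several new ones. The book retains the focus of the earlier editions, guiding marketing analysts, business managers, and data mining specialists in using data mining methods and techniques to solve important business problems. Without sacrificing accuracy, the authors present even complex topics clearly and concisely, keeping technical jargon and mathematical formulas to a minimum. Each technical topic is illustrated with case studies and real-world examples drawn from the authors' experience, and every chapter includes valuable tips for practitioners. New techniques, and techniques covered in greater depth, include linear and logistic regression models, incremental response (uplift) modeling, naive Bayes models, table lookup models, similarity models, radial basis function networks, expectation-maximization (EM) clustering, and swarm intelligence. New chapters are devoted to data preparation, derived variables, principal components analysis and other variable reduction techniques, and text mining.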
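After establishing the full business context for data mining applications and introducing the aspects of data mining methodology common to all data mining projects, Data Mining Techniques covers each important data mining technique in detail.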