DataScience
The Elements of Statistical Learning (Douban)
Authors: T. Hastie / R. Tibshirani Springer 2003 - 7
During the past decade there has been an explosion in computation and information technology. With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting--the first comprehensive treatment of this topic in any book. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the Lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, and projection pursuit.
Data Science from Scratch (Douban)
Author: Joel Grus O'Reilly Media 2015 - 4
Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch.
If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out.
Get a crash course in Python
Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science
Collect, explore, clean, munge, and manipulate data
Dive into the fundamentals of machine learning
Implement models such as k-nearest neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering
Explore recommender systems, natural language processing, network analysis, MapReduce, and databases
Data Science for Business (Douban)
Authors: Foster Provost / Tom Fawcett O'Reilly Media 2013 - 8
Review
"A must-read resource for anyone who is serious about embracing the opportunity of big data."
-- Craig Vaughan
Global Vice President at SAP
"This book goes beyond data analytics 101. It's the essential guide for those of us (all of us?) whose businesses are built on the ubiquity of data opportunities and the new mandate for data-driven decision-making."
-- Tom Phillips
CEO of Media6Degrees and Former Head of Google Search and Analytics
"Data is the foundation of new waves of productivity growth, innovation, and richer customer insight. Only recently viewed broadly as a source of competitive advantage, dealing well with data is rapidly becoming table stakes to stay in the game. The authors' deep applied experience makes this a must read--a window into your competitor's strategy."
-- Alan Murray
Serial Entrepreneur; Partner at Coriolis Ventures
"This timely book says out loud what has finally become apparent: in the modern world, Data is Business, and you can no longer think business without thinking data. Read this book and you will understand the Science behind thinking data."
-- Ron Bekkerman
Chief Data Officer at Carmel Ventures
"A great book for business managers who lead or interact with data scientists, who wish to better understand the principles and algorithms available without the technical details of single-disciplinary books."
-- Ronny Kohavi
Partner Architect at Microsoft Online Services Division
About the Author
Foster Provost is Professor and NEC Faculty Fellow at the NYU Stern School of Business where he teaches in the MBA, Business Analytics, and Data Science programs. His award-winning research is read and cited broadly. Prof. Provost has co-founded several successful companies focusing on data science for marketing.
Tom Fawcett holds a Ph.D. in machine learning and has worked in industry R&D for more than two decades for companies such as GTE Laboratories, NYNEX/Verizon Labs, and HP Labs. His published work has become standard reading in data science.