R
An Introduction to Statistical Learning Douban Goodreads
9.8 (12 ratings) Authors: Gareth James / Daniela Witten Springer 2013 - 8
An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.
ggplot2 Douban Goodreads
8.0 (7 ratings) Author: Hadley Wickham Springer 2009 - 8 Other title: ggplot2: Elegant Graphics for Data Analysis
This book describes ggplot2, a new data visualization package for R that uses the insights from Leland Wilkinson's Grammar of Graphics to create a powerful and flexible system for creating data graphics. With ggplot2, it's easy to:
* produce handsome, publication-quality plots, with automatic legends created from the plot specification
* superpose multiple layers (points, lines, maps, tiles, box plots to name a few) from different data sources, with automatically adjusted common scales
* add customisable smoothers that use the powerful modelling capabilities of R, such as loess, linear models, generalised additive models and robust regression
* save any ggplot2 plot (or part thereof) for later modification or reuse
* create custom themes that capture in-house or journal style requirements, and that can easily be applied to multiple plots
* approach your graph from a visual perspective, thinking about how each component of the data is represented on the final plot.
This book will be useful to everyone who has struggled with displaying their data in an informative and attractive way. You will need some basic knowledge of R (i.e. you should be able to get your data into R), but ggplot2 is a mini-language specifically tailored for producing graphics, and you'll learn everything you need in the book. After reading this book you'll be able to produce graphics customized precisely for your problems, and you'll find it easy to get graphics out of your head and on to the screen or page.
Automated Data Collection with R Douban
Authors: Simon Munzert / Christian Rubba Wiley 2015 - 1
A hands-on guide to web scraping and text mining for both beginners and experienced users of R
(1) Introduces fundamental concepts of the main architecture of the web and databases and covers HTTP, HTML, XML, JSON, and SQL.
(2) Provides basic techniques to query web documents and data sets (XPath and regular expressions).
(3) An extensive set of exercises is presented to guide the reader through each technique.
(4) Explores both supervised and unsupervised techniques as well as advanced techniques such as data scraping and text management.
(5) Case studies are featured throughout along with examples for each technique presented.
(6) R code and solutions to exercises featured in the book are provided on a supporting website.