Automated Data Collection with R
Douban
A Practical Guide to Web Scraping and Text Mining.
Simon Munzert / Christian Rubba …
Summary
A hands-on guide to web scraping and text mining for both beginners and experienced users of R.
(1) Introduces fundamental concepts of the main architecture of the web and databases and covers HTTP, HTML, XML, JSON, SQL.
(2) Provides basic techniques to query web documents and data sets (XPath and regular expressions).
(3) An extensive set of exercises is presented to guide the reader through each technique.
(4) Explores both supervised and unsupervised techniques as well as advanced techniques such as data scraping and text management.
(5) Case studies are featured throughout along with examples for each technique presented.
(6) R code and solutions to exercises featured in the book are provided on a supporting website.