Spark in Action, Second Edition

ISBN: 9781617295522
Author: Jean-Georges Perrin
Publisher: Manning Publications
Publication date: 2020-04
Binding: Paperback
Price: USD 59.99
Pages: 565

Description

Spark in Action, Second Edition is an entirely new book that teaches you everything you need to create end-to-end analytics pipelines in Spark. Rewritten from the ground up and illustrated with lots of helpful graphics, it covers the roles of DAGs and dataframes, the advantages of “lazy evaluation”, and ingestion from files, databases, and streams.
By working through carefully designed Java-based examples, you’ll delve into Spark SQL, interface with Python, and cache and checkpoint your data. Along the way, you’ll learn to interact with common enterprise data technologies like HDFS and file formats like Parquet, ORC, and Avro.
You’ll also discover interesting Spark use cases, like interactive reporting, machine learning pipelines, and even monitoring players in online games. You’ll even get a quick look at machine learning techniques you can apply without a PhD in mathematics! All examples are available on GitHub for you to explore and adapt as you learn. Demand for Spark-savvy developers is so high that they’re among the highest-paid in the industry today!
What's inside
Lots of examples based on the Spark Java APIs, using real-life datasets and scenarios
Examples based on Spark v2.3
Ingestion through files, databases, and streaming
Building custom ingestion processes
Querying distributed datasets with Spark SQL
Deploying Spark applications
Caching and checkpointing your data
Interfacing with data scientists using Python
Applied machine learning
Spark use cases including Lumeris, CERN, and IBM
