
PySpark in Action
Python data analysis at scale
Jonathan Rioux
About the book
PySpark in Action is a carefully engineered tutorial that helps you use PySpark to deliver your data-driven applications at any scale. This clear, hands-on guide shows you how to scale your processing across multiple machines, working with data from any source, from Hadoop-based clusters to Excel worksheets. You’ll learn how to break big analysis tasks down into manageable chunks and how to choose and use the best PySpark data abstraction for your needs. By the time you’re done, you’ll be able to write and run fast PySpark programs that are scalable, efficient to operate, and easy to debug.
What's inside
Packaging your PySpark code
Managing your data as it scales across multiple machines
Rewriting Pandas, R, and SAS jobs in PySpark (illustrated in the sketch below)
Troubleshooting common data pipeline problems
Creating reliable long-running jobs
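To give a flavour of the pandas-to-PySpark rewrites listed above, here is a minimal sketch, not taken from the book; the file name and column names are invented for illustration. It expresses a small pandas aggregation with the PySpark DataFrame API:

# Hypothetical example: average sale amount per region.
# "sales.csv", "region", and "amount" are made-up names for illustration.
import pandas as pd
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

# pandas version: runs in memory on a single machine
pdf = pd.read_csv("sales.csv")
pandas_result = pdf.groupby("region")["amount"].mean()

# PySpark version: the same aggregation, expressed with the DataFrame API
# and executed by Spark, locally or across a cluster
spark = SparkSession.builder.appName("sales-summary").getOrCreate()
sdf = spark.read.csv("sales.csv", header=True, inferSchema=True)
spark_result = sdf.groupBy("region").agg(F.avg("amount").alias("avg_amount"))
spark_result.show()

The pandas call loads everything into memory on one machine, while the PySpark version describes the same computation as a plan that Spark can distribute across as many machines as the data requires.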