Distributed Systems
Kubernetes in Action, Second Edition (Douban)
Author: Marko Lukša Manning Publications 2020 - 6
Kubernetes in Action, Second Edition teaches you to use Kubernetes to deploy container-based distributed applications. You'll start with an overview of how Docker containers work with Kubernetes and move quickly to building your first cluster. You'll gradually expand your initial application, adding features and deepening your knowledge of Kubernetes architecture and operation. In this revised and expanded second edition, you’ll take a deep dive into the structure of a Kubernetes-based application and discover how to manage a Kubernetes cluster in production. As you navigate this comprehensive guide, you'll also appreciate thorough coverage of high-value topics like monitoring, tuning, and scaling.
What's inside
Up and running with Kubernetes
Deploying containers across a cluster
Securing clusters
Updating applications with zero downtime
Hadoop: The Definitive Guide (Douban)
Author: Tom White O'Reilly Media 2015 - 4
Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.
Using Hadoop 2 exclusively, author Tom White presents new chapters on YARN and several Hadoop-related projects such as Parquet, Flume, Crunch, and Spark. You’ll learn about recent changes to Hadoop, and explore new case studies on Hadoop’s role in healthcare systems and genomics data processing.
Learn fundamental components such as MapReduce, HDFS, and YARN
Explore MapReduce in depth, including steps for developing applications with it
Set up and maintain a Hadoop cluster running HDFS and MapReduce on YARN
Learn two data formats: Avro for data serialization and Parquet for nested data
Use data ingestion tools such as Flume (for streaming data) and Sqoop (for bulk data transfer)
Understand how high-level data processing tools like Pig, Hive, Crunch, and Spark work with Hadoop
Learn the HBase distributed database and the ZooKeeper distributed configuration service
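December 5, 2020 Read
I have only just started (less than half of Part I), but everything is explained very clearly. As you would expect from a foundation member writing about a system he helped design... Even the instructions for installing and configuring a Hadoop cluster are more thorough than the garbage official documentation (…). The difference in quality is obvious 🤣 (Update Dec 5, 2020) I skipped the introductions to Pig, Hive, and the other Apache ecosystem components, as well as the case studies. I came away with the illusion that I had completely mastered Hadoop. Then again, well, excuse me for learning something nobody uses (half annoyed)
Distributed Systems / Big Data / Programming and Development Tools / English Original Edition / Computer Science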
Distributed Systems (Chinese edition: 分布式系统) (Douban)
Distributed Systems: Concepts and Design, Fifth Edition
Author: George Coulouris (UK) / Jean Dollimore Translator: 金蓓弘 / 马应龙 China Machine Press (机械工业出版社) 2013 - 3
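From mobile phones to the Internet, our lives increasingly depend on distributed systems that link computers and other devices together seamlessly and transparently. This book gives a comprehensive introduction to the design principles and practice of distributed systems and to their latest developments, and it uses numerous up-to-date case studies to illustrate how distributed systems are designed and developed.
Earlier editions of this book have been adopted as a textbook by many well-known universities, including the University of Edinburgh, the University of Illinois, Carnegie Mellon University, the University of Southern California, Texas A&M University, the University of Toronto, Rochester Institute of Technology, and Peking University. Building on the previous edition, the fifth edition adds three new chapters covering indirect communication, distributed objects and components, and designing distributed systems (using Google as a case study).
The book's website, www.cdk5.net, offers a wealth of learning and teaching resources for students and instructors (source code, references, lecture slides, errata, and more).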
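December 2, 2020 Read
The English original of this book is already abstruse and dry enough. The Chinese translation is even more dreadful. Part of the reason, admittedly, is that my own level of knowledge is too low and I need a survey-style book to get an initial grasp of the concepts. For introducing concepts, Tanenbaum's Principles and Paradigms is far more interesting and easier to follow; for technical details, you are better off going straight to the papers. I don't know why it has so many devotees. (Update Dec 2, 2020) I finally could not keep reading... This book is unreadable, though it is passable as supplementary notes to Tanenbaum. I don't recommend even the original, and I recommend the Chinese translation even less...
Distributed Systems / Computer Science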
Hadoop Application Architectures (Douban)
Author: Mark Grover / Ted Malaska O'Reilly Media 2015 - 4
With Early Release ebooks, you get books in their earliest form — the author's raw and unedited content as he or she writes — so you can take advantage of these technologies long before the official release of these titles. You'll also receive updates when significant changes are made, new chapters as they're written, and the final ebook bundle.
Get expert guidance on architecting end-to-end data management solutions with Apache Hadoop. While many sources explain how to use various components in the Hadoop ecosystem, this practical book takes you through architectural considerations necessary to tie those components together into a complete tailored application, based on your particular use case.
To reinforce those lessons, the book’s second section provides detailed examples of architecture used in some of the most commonly found Hadoop applications. Whether you’re designing and implementing a new Hadoop application, or planning to integrate Hadoop into your existing data infrastructure, Hadoop Application Architectures will skillfully guide you through the process.
The Early Release edition begins with chapters that concentrate on design considerations for Data Modeling and Data Movement in Hadoop:
Explore whether your application should store data on Hadoop Distributed File System (HDFS) or HBase
Get best practices for designing an HDFS or HBase schema
Learn how to design schemas for SQL-on-Hadoop (e.g. Hive, Impala, HCatalog) tables
Designing Data-Intensive Applications (Douban, Goodreads)
9.4 (21 ratings) Author: Martin Kleppmann O'Reilly Media 2017 - 4
Data is at the center of many challenges in system design today. Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, including relational databases, NoSQL datastores, stream or batch processors, and message brokers. What are the right choices for your application? How do you make sense of all these buzzwords?
In this practical and comprehensive guide, author Martin Kleppmann helps you navigate this diverse landscape by examining the pros and cons of various technologies for processing and storing data. Software keeps changing, but the fundamental principles remain the same. With this book, software engineers and architects will learn how to apply those ideas in practice, and how to make full use of data in modern applications.
Peer under the hood of the systems you already use, and learn how to use and operate them more effectively
Make informed decisions by identifying the strengths and weaknesses of different tools
Navigate the trade-offs around consistency, scalability, fault tolerance, and complexity
Understand the distributed systems research upon which modern databases are built
Peek behind the scenes of major online services, and learn from their architectures